From patchwork Mon Aug 19 23:28:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13769173
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 1/5] io_uring: encapsulate extraneous wait flags into a separate struct
Date: Mon, 19 Aug 2024 17:28:49 -0600
Message-ID: <20240819233042.230956-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240819233042.230956-1-axboe@kernel.dk>
References: <20240819233042.230956-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Rather than need to pass in 2 or 3 separate arguments, add a struct
to encapsulate the timeout and sigset_t parts of waiting. In preparation
for adding another argument for waiting.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 20229e72b65c..37053d32c668 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2384,13 +2384,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+struct ext_arg {
+	size_t argsz;
+	struct __kernel_timespec __user *ts;
+	const sigset_t __user *sig;
+};
+
 /*
  * Wait until events become available, if we don't already have some. The
  * application must reap them itself, as they reside on the shared cq ring.
  */
 static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
-			  const sigset_t __user *sig, size_t sigsz,
-			  struct __kernel_timespec __user *uts)
+			  struct ext_arg *ext_arg)
 {
 	struct io_wait_queue iowq;
 	struct io_rings *rings = ctx->rings;
@@ -2415,10 +2420,10 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
 	iowq.timeout = KTIME_MAX;
 
-	if (uts) {
+	if (ext_arg->ts) {
 		struct timespec64 ts;
 
-		if (get_timespec64(&ts, uts))
+		if (get_timespec64(&ts, ext_arg->ts))
 			return -EFAULT;
 
 		iowq.timeout = timespec64_to_ktime(ts);
@@ -2426,14 +2431,14 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 			iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx));
 	}
 
-	if (sig) {
+	if (ext_arg->sig) {
 #ifdef CONFIG_COMPAT
 		if (in_compat_syscall())
-			ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig,
-						      sigsz);
+			ret = set_compat_user_sigmask((const compat_sigset_t __user *)ext_arg->sig,
+						      ext_arg->argsz);
 		else
 #endif
-			ret = set_user_sigmask(sig, sigsz);
+			ret = set_user_sigmask(ext_arg->sig, ext_arg->argsz);
 
 		if (ret)
 			return ret;
@@ -3112,9 +3117,8 @@ static int io_validate_ext_arg(unsigned flags, const void __user *argp, size_t a
 	return 0;
 }
 
-static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz,
-			  struct __kernel_timespec __user **ts,
-			  const sigset_t __user **sig)
+static int io_get_ext_arg(unsigned flags, const void __user *argp,
+			  struct ext_arg *ext_arg)
 {
 	struct io_uring_getevents_arg arg;
 
@@ -3123,8 +3127,8 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz
 	 * is just a pointer to the sigset_t.
 	 */
 	if (!(flags & IORING_ENTER_EXT_ARG)) {
-		*sig = (const sigset_t __user *) argp;
-		*ts = NULL;
+		ext_arg->sig = (const sigset_t __user *) argp;
+		ext_arg->ts = NULL;
 		return 0;
 	}
 
@@ -3132,15 +3136,15 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz
 	 * EXT_ARG is set - ensure we agree on the size of it and copy in our
 	 * timespec and sigset_t pointers if good.
 	 */
-	if (*argsz != sizeof(arg))
+	if (ext_arg->argsz != sizeof(arg))
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))
 		return -EFAULT;
 	if (arg.pad)
 		return -EINVAL;
-	*sig = u64_to_user_ptr(arg.sigmask);
-	*argsz = arg.sigmask_sz;
-	*ts = u64_to_user_ptr(arg.ts);
+	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
+	ext_arg->argsz = arg.sigmask_sz;
+	ext_arg->ts = u64_to_user_ptr(arg.ts);
 	return 0;
 }
 
@@ -3246,15 +3250,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			}
 			mutex_unlock(&ctx->uring_lock);
 		} else {
-			const sigset_t __user *sig;
-			struct __kernel_timespec __user *ts;
+			struct ext_arg ext_arg = { .argsz = argsz };
 
-			ret2 = io_get_ext_arg(flags, argp, &argsz, &ts, &sig);
+			ret2 = io_get_ext_arg(flags, argp, &ext_arg);
 			if (likely(!ret2)) {
 				min_complete = min(min_complete,
 						   ctx->cq_entries);
 				ret2 = io_cqring_wait(ctx, min_complete, flags,
-						      sig, argsz, ts);
+						      &ext_arg);
 			}
 		}
 

From patchwork Mon Aug 19 23:28:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13769174
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 2/5] io_uring: move schedule wait logic into helper
Date: Mon, 19 Aug 2024 17:28:50 -0600
Message-ID: <20240819233042.230956-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240819233042.230956-1-axboe@kernel.dk>
References: <20240819233042.230956-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for expanding how we handle waits, move the actual
schedule and schedule_timeout() handling into a helper.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 37053d32c668..9e2b8d4c05db 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2350,22 +2350,10 @@ static bool current_pending_io(void)
 	return percpu_counter_read_positive(&tctx->inflight);
 }
 
-/* when returns >0, the caller should retry */
-static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-					  struct io_wait_queue *iowq)
+static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+				     struct io_wait_queue *iowq)
 {
-	int ret;
-
-	if (unlikely(READ_ONCE(ctx->check_cq)))
-		return 1;
-	if (unlikely(!llist_empty(&ctx->work_llist)))
-		return 1;
-	if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL)))
-		return 1;
-	if (unlikely(task_sigpending(current)))
-		return -EINTR;
-	if (unlikely(io_should_wake(iowq)))
-		return 0;
+	int ret = 0;
 
 	/*
 	 * Mark us as being in io_wait if we have pending requests, so cpufreq
@@ -2374,7 +2362,6 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (current_pending_io())
 		current->in_iowait = 1;
-	ret = 0;
 	if (iowq->timeout == KTIME_MAX)
 		schedule();
 	else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0,
@@ -2384,6 +2371,24 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+/* If this returns > 0, the caller should retry */
+static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
+					  struct io_wait_queue *iowq)
+{
+	if (unlikely(READ_ONCE(ctx->check_cq)))
+		return 1;
+	if (unlikely(!llist_empty(&ctx->work_llist)))
+		return 1;
+	if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL)))
+		return 1;
+	if (unlikely(task_sigpending(current)))
+		return -EINTR;
+	if (unlikely(io_should_wake(iowq)))
+		return 0;
+
+	return __io_cqring_wait_schedule(ctx, iowq);
+}
+
 struct ext_arg {
 	size_t argsz;
 	struct __kernel_timespec __user *ts;

From patchwork Mon Aug 19 23:28:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13769175
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 3/5] io_uring: implement our own schedule timeout handling
Date: Mon, 19 Aug 2024 17:28:51 -0600
Message-ID: <20240819233042.230956-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240819233042.230956-1-axboe@kernel.dk>
References: <20240819233042.230956-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for having two distinct timeouts, and to avoid waking the
task if we don't need to.

Signed-off-by: Jens Axboe
---
 io_uring/io_uring.c | 41 ++++++++++++++++++++++++++++++++++++-----
 io_uring/io_uring.h |  2 ++
 2 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9e2b8d4c05db..ddfbe04c61ed 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2322,7 +2322,7 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
 	 * Cannot safely flush overflowed CQEs from here, ensure we wake up
 	 * the task, and the next invocation will do it.
 	 */
-	if (io_should_wake(iowq) || io_has_work(iowq->ctx))
+	if (io_should_wake(iowq) || io_has_work(iowq->ctx) || iowq->hit_timeout)
 		return autoremove_wake_function(curr, mode, wake_flags, key);
 	return -1;
 }
@@ -2350,6 +2350,38 @@ static bool current_pending_io(void)
 	return percpu_counter_read_positive(&tctx->inflight);
 }
 
+static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer)
+{
+	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
+	struct io_ring_ctx *ctx = iowq->ctx;
+
+	WRITE_ONCE(iowq->hit_timeout, 1);
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		wake_up_process(ctx->submitter_task);
+	else
+		io_cqring_wake(ctx);
+	return HRTIMER_NORESTART;
+}
+
+static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
+				      clockid_t clock_id)
+{
+	iowq->hit_timeout = 0;
+	hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS);
+	iowq->t.function = io_cqring_timer_wakeup;
+	hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0);
+	hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS);
+
+	if (!READ_ONCE(iowq->hit_timeout))
+		schedule();
+
+	hrtimer_cancel(&iowq->t);
+	destroy_hrtimer_on_stack(&iowq->t);
+	__set_current_state(TASK_RUNNING);
+
+	return READ_ONCE(iowq->hit_timeout) ? -ETIME : 0;
+}
+
 static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 				     struct io_wait_queue *iowq)
 {
@@ -2362,11 +2394,10 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (current_pending_io())
 		current->in_iowait = 1;
-	if (iowq->timeout == KTIME_MAX)
+	if (iowq->timeout != KTIME_MAX)
+		ret = io_cqring_schedule_timeout(iowq, ctx->clockid);
+	else
 		schedule();
-	else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0,
-						 HRTIMER_MODE_ABS, ctx->clockid))
-		ret = -ETIME;
 	current->in_iowait = 0;
 	return ret;
 }
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 9935819f12b7..f95c1b080f4b 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -40,7 +40,9 @@ struct io_wait_queue {
 	struct io_ring_ctx *ctx;
 	unsigned cq_tail;
 	unsigned nr_timeouts;
+	int hit_timeout;
 	ktime_t timeout;
+	struct hrtimer t;
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	ktime_t napi_busy_poll_dt;

From patchwork Mon Aug 19 23:28:52 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13769176
b=FIyZYu/tW9J6Ky0XP/TumAEoa9BnT1Gsx9X6WhcSSNKyy6V+8ouhH6Z1Sy1BXfNURDQIvz5Q9TBb+7Srq2QhxzZXpuq8YNFvUCP0Fa/y80o14ORpHA90Swnnel3dX/9o6lM1YdLgxRWO+RPTmrCUsBb7KV8EVXMLI/CoJWTrpKQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=2wt59RQP; arc=none smtp.client-ip=209.85.215.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="2wt59RQP" Received: by mail-pg1-f181.google.com with SMTP id 41be03b00d2f7-78f86e56b4cso429687a12.3 for ; Mon, 19 Aug 2024 16:30:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1724110252; x=1724715052; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xjVRvH7/wAQ3u0PxEu9L4on+dtOhqN39XwqqSWu2Nic=; b=2wt59RQPTSv4WX5K1eOUZURsqe3KL7Nokkkj2b0TF4NQqoyTMgcHuqsBqxM+j0i/j/ KyK2CmK88GOd+U5tjQrGOW0h17HELMZvK3VvLiFtQ+30ikLWtlok8bk3zYiylqd8K9cB Ka6Hbec8aJMtGUPlUnRuKiytmZSB4b4tFeZPrL2b+hnerzO82querinQ/Sg2fKiYhpwI K10UDALSlo3mCCbU3woody5SXJhwfal3/EudCFHBf7O/Me0mOpWoCohuG9BOrL5mON6Z nE6fLNCxAsrQnIzqPPS9xCeVt6Szxq4PIO3jeXSxcxou4aKQDffUKRdVNwynLnwdWiWV 3nJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724110252; x=1724715052; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=xjVRvH7/wAQ3u0PxEu9L4on+dtOhqN39XwqqSWu2Nic=; b=BFu2FoB8qwWu50GSMaO6l8DRyJQtU2t9DtIoCG2SG6AimEo6u9e1KN88uUmqVat0Au b6bNOcQ1p44mFZgZBQnO/ZLZibkJLvanKL9PepEacj3sExXVUBxUt7/jNT032dWGXHNt D6mWhg/M08G+vhJCf6sGu4uKJbM8WS7lsqfhEITj8+wN/r7aaTx0iC4Ey/xbDJxHT4jD WE7XaT4IyIKgOnTSuzGjeT/L0To2ayuNLYibwEjJZlMwsedLFjsOwvtdd19/LUIURz2h fwSzUnLGnUAUCjOYCsv+FicL+hZZeIYM9Vq6PGNLOziKZzVblZwDQaNU6h5MS3gt0BpW sObQ== X-Gm-Message-State: AOJu0Yx3dW0mqRHJZ/ErwkGDqvqzBLtDVR0ikD4rQ/GGmIwnrr3kfOvD CXH8BfIVlB1/mm3NBjvHrQOXwW4qFEQ3MAdIDnHzxwsrjCerwJkTL+PuUDWBFtyPNi1y3Ve4vj8 V X-Google-Smtp-Source: AGHT+IGfyfEuDt7zvo3cFrCcaO0WBJONcuQnL3ZmZbdb/rAHyRAC5sYqECoijY8BJKWlgQcwynfgsg== X-Received: by 2002:a05:6a21:9982:b0:1c4:e645:559b with SMTP id adf61e73a8af0-1c9050929f0mr10679507637.8.1724110251920; Mon, 19 Aug 2024 16:30:51 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-7c6b61dc929sm8219838a12.40.2024.08.19.16.30.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Aug 2024 16:30:50 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: dw@davidwei.uk, Jens Axboe Subject: [PATCH 4/5] io_uring: add support for batch wait timeout Date: Mon, 19 Aug 2024 17:28:52 -0600 Message-ID: <20240819233042.230956-5-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240819233042.230956-1-axboe@kernel.dk> References: <20240819233042.230956-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Waiting for events with io_uring has two knobs that can be set: 1) The number of events to wake for 2) The timeout associated with the event Waiting will abort when either of those conditions are met, as expected. This adds support for a third event, which is associated with the number of events to wait for. 
Applications generally like to handle batches of completions, and right now they'd set a number of events to wait for and the timeout for that. If no events have been received but the timeout triggers, control is returned to the application and it can wait again. However, if the application doesn't have anything to do until events are reaped, then it's possible to make this waiting more efficient. For example, the application may have a latency time of 50 usecs and wanting to handle a batch of 8 requests at the time. If it uses 50 usecs as the timeout, then it'll be doing 20K context switches per second even if nothing is happening. This introduces the notion of min batch wait time. If the min batch wait time expires, then we'll return to userspace if we have any events at all. If none are available, the general wait time is applied. Any request arriving after the min batch wait time will cause waiting to stop and return control to the application. Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 75 +++++++++++++++++++++++++++++++++++++++------ io_uring/io_uring.h | 2 ++ 2 files changed, 67 insertions(+), 10 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index ddfbe04c61ed..d09a7c2e1096 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2363,13 +2363,62 @@ static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer) return HRTIMER_NORESTART; } +/* + * Doing min_timeout portion. If we saw any timeouts, events, or have work, + * wake up. If not, and we have a normal timeout, switch to that and keep + * sleeping. 
+ */
+static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer)
+{
+	struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t);
+	struct io_ring_ctx *ctx = iowq->ctx;
+
+	/* no general timeout, or shorter, we are done */
+	if (iowq->timeout == KTIME_MAX ||
+	    ktime_after(iowq->min_timeout, iowq->timeout))
+		goto out_wake;
+	/* work we may need to run, wake function will see if we need to wake */
+	if (io_has_work(ctx))
+		goto out_wake;
+	/* got events since we started waiting, min timeout is done */
+	if (iowq->cq_min_tail != READ_ONCE(ctx->rings->cq.tail))
+		goto out_wake;
+	/* if we have any events and min timeout expired, we're done */
+	if (io_cqring_events(ctx))
+		goto out_wake;
+
+	/*
+	 * If using deferred task_work running and application is waiting on
+	 * more than one request, ensure we reset it now where we are switching
+	 * to normal sleeps. Any request completion post min_wait should wake
+	 * the task and return.
+	 */
+	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+		atomic_set(&ctx->cq_wait_nr, 1);
+
+	iowq->t.function = io_cqring_timer_wakeup;
+	hrtimer_set_expires(timer, iowq->timeout);
+	return HRTIMER_RESTART;
+out_wake:
+	return io_cqring_timer_wakeup(timer);
+}
+
 static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
-				      clockid_t clock_id)
+				      clockid_t clock_id, ktime_t start_time)
 {
+	ktime_t timeout;
+
 	iowq->hit_timeout = 0;
 	hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS);
-	iowq->t.function = io_cqring_timer_wakeup;
-	hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0);
+	if (iowq->min_timeout) {
+		timeout = ktime_add_ns(iowq->min_timeout, start_time);
+		iowq->t.function = io_cqring_min_timer_wakeup;
+	} else {
+		timeout = iowq->timeout;
+		iowq->t.function = io_cqring_timer_wakeup;
+	}
+
+	hrtimer_set_expires_range_ns(&iowq->t, timeout, 0);
 	hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS);

 	if (!READ_ONCE(iowq->hit_timeout))
@@ -2383,7 +2432,8 @@ static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
 }

 static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-				     struct io_wait_queue *iowq)
+				     struct io_wait_queue *iowq,
+				     ktime_t start_time)
 {
 	int ret = 0;

@@ -2394,8 +2444,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (current_pending_io())
 		current->in_iowait = 1;
-	if (iowq->timeout != KTIME_MAX)
-		ret = io_cqring_schedule_timeout(iowq, ctx->clockid);
+	if (iowq->timeout != KTIME_MAX || iowq->min_timeout != KTIME_MAX)
+		ret = io_cqring_schedule_timeout(iowq, ctx->clockid, start_time);
 	else
 		schedule();
 	current->in_iowait = 0;
@@ -2404,7 +2454,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,

 /* If this returns > 0, the caller should retry */
 static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
-					  struct io_wait_queue *iowq)
+					  struct io_wait_queue *iowq,
+					  ktime_t start_time)
 {
 	if (unlikely(READ_ONCE(ctx->check_cq)))
 		return 1;
@@ -2417,7 +2468,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	if (unlikely(io_should_wake(iowq)))
 		return 0;

-	return __io_cqring_wait_schedule(ctx, iowq);
+	return __io_cqring_wait_schedule(ctx, iowq, start_time);
 }

 struct ext_arg {
@@ -2435,6 +2486,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 {
 	struct io_wait_queue iowq;
 	struct io_rings *rings = ctx->rings;
+	ktime_t start_time;
 	int ret;

 	if (!io_allowed_run_tw(ctx))
@@ -2453,8 +2505,11 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	INIT_LIST_HEAD(&iowq.wq.entry);
 	iowq.ctx = ctx;
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
+	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+	iowq.min_timeout = 0;
 	iowq.timeout = KTIME_MAX;
+	start_time = io_get_time(ctx);

 	if (ext_arg->ts) {
 		struct timespec64 ts;
@@ -2464,7 +2519,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,

 		iowq.timeout = timespec64_to_ktime(ts);
 		if (!(flags & IORING_ENTER_ABS_TIMER))
-			iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx));
+			iowq.timeout = ktime_add(iowq.timeout, start_time);
 	}

 	if (ext_arg->sig) {
@@ -2495,7 +2550,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 							TASK_INTERRUPTIBLE);
 	}

-	ret = io_cqring_wait_schedule(ctx, &iowq);
+	ret = io_cqring_wait_schedule(ctx, &iowq, start_time);
 	__set_current_state(TASK_RUNNING);
 	atomic_set(&ctx->cq_wait_nr, IO_CQ_WAKE_INIT);

diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index f95c1b080f4b..65078e641390 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -39,8 +39,10 @@ struct io_wait_queue {
 	struct wait_queue_entry wq;
 	struct io_ring_ctx *ctx;
 	unsigned cq_tail;
+	unsigned cq_min_tail;
 	unsigned nr_timeouts;
 	int hit_timeout;
+	ktime_t min_timeout;
 	ktime_t timeout;
 	struct hrtimer t;


From patchwork Mon Aug 19 23:28:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13769177
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 5/5] io_uring: wire up min batch wake timeout
Date: Mon, 19 Aug 2024 17:28:53 -0600
Message-ID: <20240819233042.230956-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240819233042.230956-1-axboe@kernel.dk>
References: <20240819233042.230956-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
MIME-Version: 1.0

Expose min_wait_usec in io_uring_getevents_arg, replacing the pad member
that is currently in there. The value is in usecs, which is explained in
the name as well.

Note that if min_wait_usec and a normal timeout are used in conjunction,
the normal timeout is still relative to the base time. For example, if
min_wait_usec is set to 100 and the normal timeout is 1000, the max
total time waited is still 1000.
This also means that if the normal timeout is shorter than
min_wait_usec, then only the min_wait_usec will take effect.

See previous commit for an explanation of how this works.

IORING_FEAT_MIN_TIMEOUT is added as a feature flag for this, as
applications doing submit_and_wait_timeout() style operations will
generally not see the -EINVAL from the wait side as they return the
number of IOs submitted. Only if no IOs are submitted will the -EINVAL
bubble back up to the application.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h | 3 ++-
 io_uring/io_uring.c           | 8 ++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 7af716136df9..042eab793e26 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -543,6 +543,7 @@ struct io_uring_params {
 #define IORING_FEAT_LINKED_FILE		(1U << 12)
 #define IORING_FEAT_REG_REG_RING	(1U << 13)
 #define IORING_FEAT_RECVSEND_BUNDLE	(1U << 14)
+#define IORING_FEAT_MIN_TIMEOUT		(1U << 15)

 /*
  * io_uring_register(2) opcodes and arguments
@@ -766,7 +767,7 @@ enum io_uring_register_restriction_op {
 struct io_uring_getevents_arg {
 	__u64	sigmask;
 	__u32	sigmask_sz;
-	__u32	pad;
+	__u32	min_wait_usec;
 	__u64	ts;
 };

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index d09a7c2e1096..20b67fea645d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2475,6 +2475,7 @@ struct ext_arg {
 	size_t argsz;
 	struct __kernel_timespec __user *ts;
 	const sigset_t __user *sig;
+	ktime_t min_time;
 };

 /*
@@ -2507,7 +2508,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
 	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
-	iowq.min_timeout = 0;
+	iowq.min_timeout = ext_arg->min_time;
 	iowq.timeout = KTIME_MAX;
 	start_time = io_get_time(ctx);

@@ -3231,8 +3232,7 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))
 		return -EFAULT;
-	if (arg.pad)
-		return -EINVAL;
+	ext_arg->min_time = arg.min_wait_usec * NSEC_PER_USEC;
 	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
 	ext_arg->argsz = arg.sigmask_sz;
 	ext_arg->ts = u64_to_user_ptr(arg.ts);
@@ -3633,7 +3633,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 			IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
 			IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
 			IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING |
-			IORING_FEAT_RECVSEND_BUNDLE;
+			IORING_FEAT_RECVSEND_BUNDLE | IORING_FEAT_MIN_TIMEOUT;

 	if (copy_to_user(params, p, sizeof(*p))) {
 		ret = -EFAULT;