From patchwork Wed Aug 21 14:16:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 13771658 Received: from mail-il1-f171.google.com (mail-il1-f171.google.com [209.85.166.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A4B21AF4D2 for ; Wed, 21 Aug 2024 14:19:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.166.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724249958; cv=none; b=H1HePLsD+yKmVTW0NZFhqnQHPrLVBW4HKSsDw2OSWvJUjawNvN8V9NUowuzgHy28HiLd2mep9O39uuaQ0qsPFfttq1FuKVb3ufSjdpubZLFCr7BIiV0IKlA7QqkXxEgmoJeRJqg4cZWufhbgmVLYhw/dbC3xUrNegISJP0M1yAY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724249958; c=relaxed/simple; bh=xMoc4My7QZ8jZ5RQ5B8Hxyx6CrgWlRJp/Q9mFmCwZGs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ceh6nq8WUMl7s4/tEHnPgLILE5jaQm3CsTapiCIcPhFI7siNUGsUDQkmH8wy2sJI9QSQp+oKTesOO9b/frgkdNgJtfItkO6E5ocGg0RWjk69tB2ofboGHL74boT7Mfoaw/siOZUD2M53fEnANgMmtbzugxAegGhxwXSKfDAWtQY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk; spf=pass smtp.mailfrom=kernel.dk; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b=bUIWAngH; arc=none smtp.client-ip=209.85.166.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.dk Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=kernel.dk Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel-dk.20230601.gappssmtp.com header.i=@kernel-dk.20230601.gappssmtp.com header.b="bUIWAngH" Received: by mail-il1-f171.google.com with SMTP id 
e9e14a558f8ab-39d3cd4fa49so3025395ab.1 for ; Wed, 21 Aug 2024 07:19:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1724249953; x=1724854753; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=A6LTIh3DcVfrLOVFkRBBopt0cmnN3QyNN9FFhzOKYYo=; b=bUIWAngHGY0+adtTvpg0QQLsZfKdZD3u7EJl+4os4QJvFJqGvd2S16YscpVA6AGo5j UCto7bOHZg/Eeoggi84cxZouSEZuTRmjBwhol7I2nBNIH8RmMQ2RXpQJbvLWoWMPLmDk WVWyJCIRYX4aKY2nJ2RHJs1Kxz/yXSE+sMT/cKXxvcxq3fYVR4oUBYFkbIWX0Lqef2No iSA0BxFs0c8rwkZ5Q3Ybpu2220UM6WnLNqZ7iwlmPOY/L2VaDFWU1Ev/E9QyTxZZDiYL Ws2nRBBvWXLip5UbU+2akpBppDIX9ySbFFKvhHISNa7KIJ2xK7nA4zgkJkNmqyqKpDIR mBSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724249953; x=1724854753; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A6LTIh3DcVfrLOVFkRBBopt0cmnN3QyNN9FFhzOKYYo=; b=TUmmoik5ROgKQQKY6dI2NkipSEYECxph96N0f6iJh7/NV03eUg8P9xN0GlufMykSS/ h7j/anS+XzouBVAnutC1yuEiQTz7rLg7z24muZAJ6A/5TUaLT7g4+3m89ANjJPwXav6c +66v9aQiGwPx8PCrdmPuhwJSx7fzCBx2vZ/FMtfRin/YZdf0/jc42Ah10OlPw4Q1VJe1 z7lgAmd+tMIQ7fX4IrbLNHOJX0Tw5EhRjg88RwHyLCfPArSFXm5r+jGJWDYxUI/tuPoH WsxB4CqaYMbMQrqSn7PldzPEgAAWU1Rq1gRNyADO3T1tdYsMUFa5cc7UCyzWLZC2NRYu Xz8w== X-Gm-Message-State: AOJu0Yx2oGssVq2hYp1Me95D8BbWHqNlghlysiPsy0jHK7S8p1w1Wvr+ bGhd9O8E9eluvKgstdgLr1qEHyk1atFo3WJen8JIMWsKGHl91EZ6EVtLZmTdRVj/4QB46BnfLnI m X-Google-Smtp-Source: AGHT+IHYoxO7kjfbBsDXbVUJLAGr5MmQ4Afq5NyUBt5LDoVdknrp46HPripSzP2q2zbjiy7addY1QQ== X-Received: by 2002:a05:6e02:152b:b0:39d:2e6f:bb19 with SMTP id e9e14a558f8ab-39d6c3c0e6cmr18502195ab.10.1724249953467; Wed, 21 Aug 2024 07:19:13 -0700 (PDT) Received: from localhost.localdomain ([96.43.243.2]) by smtp.gmail.com with ESMTPSA id 
e9e14a558f8ab-39d1eb0bc93sm50967285ab.19.2024.08.21.07.19.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Aug 2024 07:19:13 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org Cc: dw@davidwei.uk, Jens Axboe Subject: [PATCH 1/5] io_uring: encapsulate extraneous wait flags into a separate struct Date: Wed, 21 Aug 2024 08:16:22 -0600 Message-ID: <20240821141910.204660-2-axboe@kernel.dk> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240821141910.204660-1-axboe@kernel.dk> References: <20240821141910.204660-1-axboe@kernel.dk> Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Rather than need to pass in 2 or 3 separate arguments, add a struct to encapsulate the timeout and sigset_t parts of waiting. In preparation for adding another argument for waiting. Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 45 ++++++++++++++++++++++++--------------------- 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 20229e72b65c..37053d32c668 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2384,13 +2384,18 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, return ret; } +struct ext_arg { + size_t argsz; + struct __kernel_timespec __user *ts; + const sigset_t __user *sig; +}; + /* * Wait until events become available, if we don't already have some. The * application must reap them itself, as they reside on the shared cq ring. 
*/ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, - const sigset_t __user *sig, size_t sigsz, - struct __kernel_timespec __user *uts) + struct ext_arg *ext_arg) { struct io_wait_queue iowq; struct io_rings *rings = ctx->rings; @@ -2415,10 +2420,10 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events; iowq.timeout = KTIME_MAX; - if (uts) { + if (ext_arg->ts) { struct timespec64 ts; - if (get_timespec64(&ts, uts)) + if (get_timespec64(&ts, ext_arg->ts)) return -EFAULT; iowq.timeout = timespec64_to_ktime(ts); @@ -2426,14 +2431,14 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx)); } - if (sig) { + if (ext_arg->sig) { #ifdef CONFIG_COMPAT if (in_compat_syscall()) - ret = set_compat_user_sigmask((const compat_sigset_t __user *)sig, - sigsz); + ret = set_compat_user_sigmask((const compat_sigset_t __user *)ext_arg->sig, + ext_arg->argsz); else #endif - ret = set_user_sigmask(sig, sigsz); + ret = set_user_sigmask(ext_arg->sig, ext_arg->argsz); if (ret) return ret; @@ -3112,9 +3117,8 @@ static int io_validate_ext_arg(unsigned flags, const void __user *argp, size_t a return 0; } -static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz, - struct __kernel_timespec __user **ts, - const sigset_t __user **sig) +static int io_get_ext_arg(unsigned flags, const void __user *argp, + struct ext_arg *ext_arg) { struct io_uring_getevents_arg arg; @@ -3123,8 +3127,8 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz * is just a pointer to the sigset_t. 
*/ if (!(flags & IORING_ENTER_EXT_ARG)) { - *sig = (const sigset_t __user *) argp; - *ts = NULL; + ext_arg->sig = (const sigset_t __user *) argp; + ext_arg->ts = NULL; return 0; } @@ -3132,15 +3136,15 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz * EXT_ARG is set - ensure we agree on the size of it and copy in our * timespec and sigset_t pointers if good. */ - if (*argsz != sizeof(arg)) + if (ext_arg->argsz != sizeof(arg)) return -EINVAL; if (copy_from_user(&arg, argp, sizeof(arg))) return -EFAULT; if (arg.pad) return -EINVAL; - *sig = u64_to_user_ptr(arg.sigmask); - *argsz = arg.sigmask_sz; - *ts = u64_to_user_ptr(arg.ts); + ext_arg->sig = u64_to_user_ptr(arg.sigmask); + ext_arg->argsz = arg.sigmask_sz; + ext_arg->ts = u64_to_user_ptr(arg.ts); return 0; } @@ -3246,15 +3250,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit, } mutex_unlock(&ctx->uring_lock); } else { - const sigset_t __user *sig; - struct __kernel_timespec __user *ts; + struct ext_arg ext_arg = { .argsz = argsz }; - ret2 = io_get_ext_arg(flags, argp, &argsz, &ts, &sig); + ret2 = io_get_ext_arg(flags, argp, &ext_arg); if (likely(!ret2)) { min_complete = min(min_complete, ctx->cq_entries); ret2 = io_cqring_wait(ctx, min_complete, flags, - sig, argsz, ts); + &ext_arg); } }

From patchwork Wed Aug 21 14:16:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13771657
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 2/5] io_uring: move schedule wait logic into helper
Date: Wed, 21 Aug 2024 08:16:23 -0600
Message-ID: <20240821141910.204660-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240821141910.204660-1-axboe@kernel.dk>
References: <20240821141910.204660-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In
preparation for expanding how we handle waits, move the actual schedule and schedule_timeout() handling into a helper. Signed-off-by: Jens Axboe --- io_uring/io_uring.c | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 37053d32c668..9e2b8d4c05db 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2350,22 +2350,10 @@ static bool current_pending_io(void) return percpu_counter_read_positive(&tctx->inflight); } -/* when returns >0, the caller should retry */ -static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, - struct io_wait_queue *iowq) +static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, + struct io_wait_queue *iowq) { - int ret; - - if (unlikely(READ_ONCE(ctx->check_cq))) - return 1; - if (unlikely(!llist_empty(&ctx->work_llist))) - return 1; - if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) - return 1; - if (unlikely(task_sigpending(current))) - return -EINTR; - if (unlikely(io_should_wake(iowq))) - return 0; + int ret = 0; /* * Mark us as being in io_wait if we have pending requests, so cpufreq @@ -2374,7 +2362,6 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, */ if (current_pending_io()) current->in_iowait = 1; - ret = 0; if (iowq->timeout == KTIME_MAX) schedule(); else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0, @@ -2384,6 +2371,24 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, return ret; } +/* If this returns > 0, the caller should retry */ +static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, + struct io_wait_queue *iowq) +{ + if (unlikely(READ_ONCE(ctx->check_cq))) + return 1; + if (unlikely(!llist_empty(&ctx->work_llist))) + return 1; + if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) + return 1; + if (unlikely(task_sigpending(current))) + return -EINTR; + if (unlikely(io_should_wake(iowq))) + return 0; + + return __io_cqring_wait_schedule(ctx, 
iowq); +} + struct ext_arg { size_t argsz; struct __kernel_timespec __user *ts;

From patchwork Wed Aug 21 14:16:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13771659
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 3/5] io_uring: implement our own schedule timeout handling
Date: Wed, 21 Aug 2024 08:16:24 -0600
Message-ID: <20240821141910.204660-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240821141910.204660-1-axboe@kernel.dk>
References: <20240821141910.204660-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

In preparation for having two distinct timeouts and avoiding waking the task if we don't need to.

Signed-off-by: Jens Axboe
--- io_uring/io_uring.c | 37 ++++++++++++++++++++++++++++++++----- io_uring/io_uring.h | 2 ++ 2 files changed, 34 insertions(+), 5 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 9e2b8d4c05db..4ba5292137c3 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2322,7 +2322,7 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode, * Cannot safely flush overflowed CQEs from here, ensure we wake up * the task, and the next invocation will do it. 
*/ - if (io_should_wake(iowq) || io_has_work(iowq->ctx)) + if (io_should_wake(iowq) || io_has_work(iowq->ctx) || iowq->hit_timeout) return autoremove_wake_function(curr, mode, wake_flags, key); return -1; } @@ -2350,6 +2350,34 @@ static bool current_pending_io(void) return percpu_counter_read_positive(&tctx->inflight); } +static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer) +{ + struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t); + + WRITE_ONCE(iowq->hit_timeout, 1); + wake_up_process(iowq->wq.private); + return HRTIMER_NORESTART; +} + +static int io_cqring_schedule_timeout(struct io_wait_queue *iowq, + clockid_t clock_id) +{ + iowq->hit_timeout = 0; + hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS); + iowq->t.function = io_cqring_timer_wakeup; + hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0); + hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS); + + if (!READ_ONCE(iowq->hit_timeout)) + schedule(); + + hrtimer_cancel(&iowq->t); + destroy_hrtimer_on_stack(&iowq->t); + __set_current_state(TASK_RUNNING); + + return READ_ONCE(iowq->hit_timeout) ? 
-ETIME : 0; +} + static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, struct io_wait_queue *iowq) { @@ -2362,11 +2390,10 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, */ if (current_pending_io()) current->in_iowait = 1; - if (iowq->timeout == KTIME_MAX) + if (iowq->timeout != KTIME_MAX) + ret = io_cqring_schedule_timeout(iowq, ctx->clockid); + else schedule(); - else if (!schedule_hrtimeout_range_clock(&iowq->timeout, 0, - HRTIMER_MODE_ABS, ctx->clockid)) - ret = -ETIME; current->in_iowait = 0; return ret; } diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h index 9935819f12b7..f95c1b080f4b 100644 --- a/io_uring/io_uring.h +++ b/io_uring/io_uring.h @@ -40,7 +40,9 @@ struct io_wait_queue { struct io_ring_ctx *ctx; unsigned cq_tail; unsigned nr_timeouts; + int hit_timeout; ktime_t timeout; + struct hrtimer t; #ifdef CONFIG_NET_RX_BUSY_POLL ktime_t napi_busy_poll_dt;

From patchwork Wed Aug 21 14:16:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13771660
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 4/5] io_uring: add support for batch wait timeout
Date: Wed, 21 Aug 2024 08:16:25 -0600
Message-ID: <20240821141910.204660-5-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240821141910.204660-1-axboe@kernel.dk>
References: <20240821141910.204660-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Waiting for events with io_uring has two knobs that can be set:

1) The number of events to wake for
2) The timeout associated with the event

Waiting will abort when either of those conditions is met, as expected.

This adds support for a third knob, which is associated with the number of events to wait for. 
Applications generally like to handle batches of completions, and right now they'd set a number of events to wait for and the timeout for that. If no events have been received but the timeout triggers, control is returned to the application and it can wait again. However, if the application doesn't have anything to do until events are reaped, then it's possible to make this waiting more efficient.

For example, the application may have a latency time of 50 usecs and want to handle a batch of 8 requests at a time. If it uses 50 usecs as the timeout, then it'll be doing 20K context switches per second even if nothing is happening.

This introduces the notion of min batch wait time. If the min batch wait time expires, then we'll return to userspace if we have any events at all. If none are available, the general wait time is applied. Any request arriving after the min batch wait time will cause waiting to stop and return control to the application.

Signed-off-by: Jens Axboe
--- io_uring/io_uring.c | 88 ++++++++++++++++++++++++++++++++++++++------- io_uring/io_uring.h | 2 ++ 2 files changed, 77 insertions(+), 13 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 4ba5292137c3..87e7cf6551d7 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2322,7 +2322,8 @@ static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode, * Cannot safely flush overflowed CQEs from here, ensure we wake up * the task, and the next invocation will do it. */ - if (io_should_wake(iowq) || io_has_work(iowq->ctx) || iowq->hit_timeout) + if (io_should_wake(iowq) || io_has_work(iowq->ctx) || + READ_ONCE(iowq->hit_timeout)) return autoremove_wake_function(curr, mode, wake_flags, key); return -1; } @@ -2359,13 +2360,66 @@ static enum hrtimer_restart io_cqring_timer_wakeup(struct hrtimer *timer) return HRTIMER_NORESTART; } +/* + * Doing min_timeout portion. If we saw any timeouts, events, or have work, + * wake up. 
If not, and we have a normal timeout, switch to that and keep + * sleeping. + */ +static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer) +{ + struct io_wait_queue *iowq = container_of(timer, struct io_wait_queue, t); + struct io_ring_ctx *ctx = iowq->ctx; + + /* no general timeout, or shorter, we are done */ + if (iowq->timeout == KTIME_MAX || + ktime_after(iowq->min_timeout, iowq->timeout)) + goto out_wake; + /* work we may need to run, wake function will see if we need to wake */ + if (io_has_work(ctx)) + goto out_wake; + /* got events since we started waiting, min timeout is done */ + if (iowq->cq_min_tail != READ_ONCE(ctx->rings->cq.tail)) + goto out_wake; + /* if we have any events and min timeout expired, we're done */ + if (io_cqring_events(ctx)) + goto out_wake; + + /* + * If using deferred task_work running and application is waiting on + * more than one request, ensure we reset it now where we are switching + * to normal sleeps. Any request completion post min_wait should wake + * the task and return. 
+ */ + if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) { + atomic_set(&ctx->cq_wait_nr, 1); + smp_mb(); + if (!llist_empty(&ctx->work_llist)) + goto out_wake; + } + + iowq->t.function = io_cqring_timer_wakeup; + hrtimer_set_expires(timer, iowq->timeout); + return HRTIMER_RESTART; +out_wake: + return io_cqring_timer_wakeup(timer); +} + static int io_cqring_schedule_timeout(struct io_wait_queue *iowq, - clockid_t clock_id) + clockid_t clock_id, ktime_t start_time) { - iowq->hit_timeout = 0; + ktime_t timeout; + + WRITE_ONCE(iowq->hit_timeout, 0); hrtimer_init_on_stack(&iowq->t, clock_id, HRTIMER_MODE_ABS); - iowq->t.function = io_cqring_timer_wakeup; - hrtimer_set_expires_range_ns(&iowq->t, iowq->timeout, 0); + if (iowq->min_timeout) { + timeout = ktime_add_ns(iowq->min_timeout, start_time); + iowq->t.function = io_cqring_min_timer_wakeup; + } else { + timeout = iowq->timeout; + iowq->t.function = io_cqring_timer_wakeup; + } + + hrtimer_set_expires_range_ns(&iowq->t, timeout, 0); hrtimer_start_expires(&iowq->t, HRTIMER_MODE_ABS); if (!READ_ONCE(iowq->hit_timeout)) @@ -2379,7 +2433,8 @@ static int io_cqring_schedule_timeout(struct io_wait_queue *iowq, } static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, - struct io_wait_queue *iowq) + struct io_wait_queue *iowq, + ktime_t start_time) { int ret = 0; @@ -2390,8 +2445,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, */ if (current_pending_io()) current->in_iowait = 1; - if (iowq->timeout != KTIME_MAX) - ret = io_cqring_schedule_timeout(iowq, ctx->clockid); + if (iowq->timeout != KTIME_MAX || iowq->min_timeout) + ret = io_cqring_schedule_timeout(iowq, ctx->clockid, start_time); else schedule(); current->in_iowait = 0; @@ -2400,7 +2455,8 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx, /* If this returns > 0, the caller should retry */ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, - struct io_wait_queue *iowq) + struct io_wait_queue *iowq, + 
ktime_t start_time) { if (unlikely(READ_ONCE(ctx->check_cq))) return 1; @@ -2413,7 +2469,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, if (unlikely(io_should_wake(iowq))) return 0; - return __io_cqring_wait_schedule(ctx, iowq); + return __io_cqring_wait_schedule(ctx, iowq, start_time); } struct ext_arg { @@ -2431,6 +2487,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, { struct io_wait_queue iowq; struct io_rings *rings = ctx->rings; + ktime_t start_time; int ret; if (!io_allowed_run_tw(ctx)) @@ -2449,8 +2506,11 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, INIT_LIST_HEAD(&iowq.wq.entry); iowq.ctx = ctx; iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts); + iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail); iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events; + iowq.min_timeout = 0; iowq.timeout = KTIME_MAX; + start_time = io_get_time(ctx); if (ext_arg->ts) { struct timespec64 ts; @@ -2460,7 +2520,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, iowq.timeout = timespec64_to_ktime(ts); if (!(flags & IORING_ENTER_ABS_TIMER)) - iowq.timeout = ktime_add(iowq.timeout, io_get_time(ctx)); + iowq.timeout = ktime_add(iowq.timeout, start_time); } if (ext_arg->sig) { @@ -2484,14 +2544,16 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags, unsigned long check_cq; if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) { - atomic_set(&ctx->cq_wait_nr, nr_wait); + /* if min timeout has been hit, don't reset wait count */ + if (!READ_ONCE(iowq.hit_timeout)) + atomic_set(&ctx->cq_wait_nr, nr_wait); set_current_state(TASK_INTERRUPTIBLE); } else { prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq, TASK_INTERRUPTIBLE); } - ret = io_cqring_wait_schedule(ctx, &iowq); + ret = io_cqring_wait_schedule(ctx, &iowq, start_time); __set_current_state(TASK_RUNNING); atomic_set(&ctx->cq_wait_nr, IO_CQ_WAKE_INIT); diff --git a/io_uring/io_uring.h 
b/io_uring/io_uring.h
index f95c1b080f4b..65078e641390 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -39,8 +39,10 @@ struct io_wait_queue {
 	struct wait_queue_entry wq;
 	struct io_ring_ctx *ctx;
 	unsigned cq_tail;
+	unsigned cq_min_tail;
 	unsigned nr_timeouts;
 	int hit_timeout;
+	ktime_t min_timeout;
 	ktime_t timeout;
 	struct hrtimer t;
 

From patchwork Wed Aug 21 14:16:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13771661
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: dw@davidwei.uk, Jens Axboe
Subject: [PATCH 5/5] io_uring: wire up min batch wake timeout
Date: Wed, 21 Aug 2024 08:16:26 -0600
Message-ID: <20240821141910.204660-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240821141910.204660-1-axboe@kernel.dk>
References: <20240821141910.204660-1-axboe@kernel.dk>
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org

Expose min_wait_usec in io_uring_getevents_arg, replacing the pad member
that is currently in there. The value is in usecs, which is explained in
the name as well.

Note that if min_wait_usec and a normal timeout are used in conjunction,
the normal timeout is still relative to the base time. For example, if
min_wait_usec is set to 100 and the normal timeout is 1000, the max
total time waited is still 1000. This also means that if the normal
timeout is shorter than min_wait_usec, then only the min_wait_usec will
take effect.

See previous commit for an explanation of how this works.

IORING_FEAT_MIN_TIMEOUT is added as a feature flag for this, as
applications doing submit_and_wait_timeout() style operations will
generally not see the -EINVAL from the wait side, as they return the
number of IOs submitted. Only if no IOs are submitted will the -EINVAL
bubble back up to the application.
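
As a rough illustration of the description above (not part of the patch),
the sketch below mirrors the new layout of io_uring_getevents_arg and
models how min_wait_usec and a relative normal timeout combine when both
are set. The names getevents_arg_sketch, FEAT_MIN_TIMEOUT_SKETCH,
has_min_timeout() and max_total_wait_usec() are made up for this example;
a real application would use the definitions from <linux/io_uring.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative local mirror of struct io_uring_getevents_arg as laid out
 * by this patch. Hypothetical name; not the uapi definition itself.
 */
struct getevents_arg_sketch {
	uint64_t sigmask;
	uint32_t sigmask_sz;
	uint32_t min_wait_usec;	/* occupies the slot of the former 'pad' */
	uint64_t ts;
};

/* Reusing 'pad' should not change the struct size or field offsets. */
static_assert(sizeof(struct getevents_arg_sketch) == 24, "size unchanged");
static_assert(offsetof(struct getevents_arg_sketch, min_wait_usec) == 12,
	      "min_wait_usec sits where pad was");

/* Same bit value the patch assigns to IORING_FEAT_MIN_TIMEOUT. */
#define FEAT_MIN_TIMEOUT_SKETCH	(1U << 15)

/* Hypothetical helper: does a features mask advertise min wait support? */
static int has_min_timeout(uint32_t features)
{
	return (features & FEAT_MIN_TIMEOUT_SKETCH) != 0;
}

/*
 * Hypothetical helper: upper bound on the time waited, in usecs, assuming
 * both a min wait and a relative normal timeout are set. Both are measured
 * from the same base time, so they do not add up; the later one bounds
 * the wait.
 */
static uint64_t max_total_wait_usec(uint32_t min_wait_usec,
				    uint64_t timeout_usec)
{
	return timeout_usec > min_wait_usec ? timeout_usec : min_wait_usec;
}
```

With min_wait_usec = 100 and a normal timeout of 1000, the helper gives
1000, matching the example in the description. An application would
presumably check the feature bit in io_uring_params.features before
setting min_wait_usec, since the wait-side -EINVAL may be hidden by a
successful submit.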

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h | 3 ++-
 io_uring/io_uring.c           | 8 ++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 7af716136df9..042eab793e26 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -543,6 +543,7 @@ struct io_uring_params {
 #define IORING_FEAT_LINKED_FILE		(1U << 12)
 #define IORING_FEAT_REG_REG_RING	(1U << 13)
 #define IORING_FEAT_RECVSEND_BUNDLE	(1U << 14)
+#define IORING_FEAT_MIN_TIMEOUT		(1U << 15)
 
 /*
  * io_uring_register(2) opcodes and arguments
@@ -766,7 +767,7 @@ enum io_uring_register_restriction_op {
 struct io_uring_getevents_arg {
 	__u64	sigmask;
 	__u32	sigmask_sz;
-	__u32	pad;
+	__u32	min_wait_usec;
 	__u64	ts;
 };
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 87e7cf6551d7..03b226689e20 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2476,6 +2476,7 @@ struct ext_arg {
 	size_t argsz;
 	struct __kernel_timespec __user *ts;
 	const sigset_t __user *sig;
+	ktime_t min_time;
 };
 
 /*
@@ -2508,7 +2509,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
 	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
-	iowq.min_timeout = 0;
+	iowq.min_timeout = ext_arg->min_time;
 	iowq.timeout = KTIME_MAX;
 	start_time = io_get_time(ctx);
 
@@ -3234,8 +3235,7 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp,
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))
 		return -EFAULT;
-	if (arg.pad)
-		return -EINVAL;
+	ext_arg->min_time = arg.min_wait_usec * NSEC_PER_USEC;
 	ext_arg->sig = u64_to_user_ptr(arg.sigmask);
 	ext_arg->argsz = arg.sigmask_sz;
 	ext_arg->ts = u64_to_user_ptr(arg.ts);
@@ -3636,7 +3636,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 			IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS |
 			IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP |
 			IORING_FEAT_LINKED_FILE | IORING_FEAT_REG_REG_RING |
-			IORING_FEAT_RECVSEND_BUNDLE;
+			IORING_FEAT_RECVSEND_BUNDLE | IORING_FEAT_MIN_TIMEOUT;
 
 	if (copy_to_user(params, p, sizeof(*p))) {
 		ret = -EFAULT;