From patchwork Tue Oct 22 02:08:28 2024
From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 1/3] io_uring: move max entry definition and ring sizing into header
Date: Mon, 21 Oct 2024 20:08:28 -0600
Message-ID: <20241022021159.820925-2-axboe@kernel.dk>
In-Reply-To: <20241022021159.820925-1-axboe@kernel.dk>
References: <20241022021159.820925-1-axboe@kernel.dk>

In preparation for needing this somewhere else, move the definitions
for the maximum CQ and SQ ring size into io_uring.h. Make the
rings_size() helper available as well, and have it take just the setup
flags argument rather than the full ring pointer. That's all that is
needed.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 14 ++++++--------
 io_uring/io_uring.h |  5 +++++
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 58b401900b41..6dea5242d666 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -105,9 +105,6 @@
 #include "alloc_cache.h"
 #include "eventfd.h"
 
-#define IORING_MAX_ENTRIES	32768
-#define IORING_MAX_CQ_ENTRIES	(2 * IORING_MAX_ENTRIES)
-
 #define SQE_COMMON_FLAGS (IOSQE_FIXED_FILE | IOSQE_IO_LINK | \
 			IOSQE_IO_HARDLINK | IOSQE_ASYNC)
 
@@ -2667,8 +2664,8 @@ static void io_rings_free(struct io_ring_ctx *ctx)
 	ctx->sq_sqes = NULL;
 }
 
-static unsigned long rings_size(struct io_ring_ctx *ctx, unsigned int sq_entries,
-				unsigned int cq_entries, size_t *sq_offset)
+unsigned long rings_size(unsigned int flags, unsigned int sq_entries,
+			 unsigned int cq_entries, size_t *sq_offset)
 {
 	struct io_rings *rings;
 	size_t off, sq_array_size;
@@ -2676,7 +2673,7 @@ static unsigned long rings_size(struct io_ring_ctx *ctx, unsigned int sq_entries
 	off = struct_size(rings, cqes, cq_entries);
 	if (off == SIZE_MAX)
 		return SIZE_MAX;
-	if (ctx->flags & IORING_SETUP_CQE32) {
+	if (flags & IORING_SETUP_CQE32) {
 		if (check_shl_overflow(off, 1, &off))
 			return SIZE_MAX;
 	}
@@ -2687,7 +2684,7 @@ static unsigned long rings_size(struct io_ring_ctx *ctx, unsigned int sq_entries
 		return SIZE_MAX;
 #endif
 
-	if (ctx->flags & IORING_SETUP_NO_SQARRAY) {
+	if (flags & IORING_SETUP_NO_SQARRAY) {
 		*sq_offset = SIZE_MAX;
 		return off;
 	}
@@ -3434,7 +3431,8 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 	ctx->sq_entries = p->sq_entries;
 	ctx->cq_entries = p->cq_entries;
 
-	size = rings_size(ctx, p->sq_entries, p->cq_entries, &sq_array_offset);
+	size = rings_size(ctx->flags, p->sq_entries, p->cq_entries,
+			  &sq_array_offset);
 	if (size == SIZE_MAX)
 		return -EOVERFLOW;
 
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 9cd9a127e9ed..4a471a810f02 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -65,6 +65,11 @@ static inline bool io_should_wake(struct io_wait_queue *iowq)
 	return dist >= 0 || atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
 }
 
+#define IORING_MAX_ENTRIES	32768
+#define IORING_MAX_CQ_ENTRIES	(2 * IORING_MAX_ENTRIES)
+
+unsigned long rings_size(unsigned int flags, unsigned int sq_entries,
+			 unsigned int cq_entries, size_t *sq_offset);
 bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow);
 int io_run_task_work_sig(struct io_ring_ctx *ctx);
 void io_req_defer_failed(struct io_kiocb *req, s32 res);
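For readers following the sizing math: rings_size() returns the size of one
contiguous allocation that holds struct io_rings, the CQE array (doubled when
IORING_SETUP_CQE32 is set), and, unless IORING_SETUP_NO_SQARRAY is set, a u32
SQ index array appended behind it. A rough userspace model of that
calculation is sketched below; model_rings_size() and the constants are
illustrative stand-ins, not kernel API (the kernel uses struct_size() on the
real struct io_rings and ALIGN(off, SMP_CACHE_BYTES), with overflow checks
omitted here).

#include <stddef.h>
#include <stdint.h>

#define SETUP_CQE32      (1U << 11)    /* mirrors IORING_SETUP_CQE32 */
#define SETUP_NO_SQARRAY (1U << 16)    /* mirrors IORING_SETUP_NO_SQARRAY */

/*
 * Model of the layout: ring header, then the CQE array (doubled for
 * CQE32), cache-line aligned, then optionally the u32 SQ index array.
 * Header/CQE sizes are stand-ins for the kernel's real structs.
 */
static size_t model_rings_size(unsigned flags, unsigned sq_entries,
                               unsigned cq_entries, size_t *sq_offset)
{
        size_t cqe_size = 16;   /* sizeof(struct io_uring_cqe) */
        size_t hdr_size = 256;  /* stand-in for sizeof(struct io_rings) */
        size_t off;

        if (flags & SETUP_CQE32)
                cqe_size *= 2;  /* 32-byte CQEs double the array */
        off = hdr_size + (size_t)cq_entries * cqe_size;
        off = (off + 63) & ~(size_t)63; /* ALIGN(off, SMP_CACHE_BYTES) stand-in */

        if (flags & SETUP_NO_SQARRAY) {
                *sq_offset = SIZE_MAX;  /* no SQ index array appended */
                return off;
        }
        *sq_offset = off;
        return off + (size_t)sq_entries * sizeof(uint32_t);
}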
From patchwork Tue Oct 22 02:08:29 2024
From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 2/3] io_uring: abstract out a bit of the ring filling logic
Date: Mon, 21 Oct 2024 20:08:29 -0600
Message-ID: <20241022021159.820925-3-axboe@kernel.dk>
In-Reply-To: <20241022021159.820925-1-axboe@kernel.dk>
References: <20241022021159.820925-1-axboe@kernel.dk>

Abstract out an io_uring_fill_params() helper, which fills out the
necessary bits of struct io_uring_params. Add it to io_uring.h as well,
in preparation for having another internal user of it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 io_uring/io_uring.c | 70 ++++++++++++++++++++++++++-------------------
 io_uring/io_uring.h |  1 +
 2 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6dea5242d666..b5974bdad48b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3498,14 +3498,8 @@ static struct file *io_uring_get_file(struct io_ring_ctx *ctx)
 				O_RDWR | O_CLOEXEC, NULL);
 }
 
-static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
-				  struct io_uring_params __user *params)
+int io_uring_fill_params(unsigned entries, struct io_uring_params *p)
 {
-	struct io_ring_ctx *ctx;
-	struct io_uring_task *tctx;
-	struct file *file;
-	int ret;
-
 	if (!entries)
 		return -EINVAL;
 	if (entries > IORING_MAX_ENTRIES) {
@@ -3547,6 +3541,42 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 		p->cq_entries = 2 * p->sq_entries;
 	}
 
+	p->sq_off.head = offsetof(struct io_rings, sq.head);
+	p->sq_off.tail = offsetof(struct io_rings, sq.tail);
+	p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
+	p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
+	p->sq_off.flags = offsetof(struct io_rings, sq_flags);
+	p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
+	p->sq_off.resv1 = 0;
+	if (!(p->flags & IORING_SETUP_NO_MMAP))
+		p->sq_off.user_addr = 0;
+
+	p->cq_off.head = offsetof(struct io_rings, cq.head);
+	p->cq_off.tail = offsetof(struct io_rings, cq.tail);
+	p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
+	p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
+	p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
+	p->cq_off.cqes = offsetof(struct io_rings, cqes);
+	p->cq_off.flags = offsetof(struct io_rings, cq_flags);
+	p->cq_off.resv1 = 0;
+	if (!(p->flags & IORING_SETUP_NO_MMAP))
+		p->cq_off.user_addr = 0;
+
+	return 0;
+}
+
+static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
+				  struct io_uring_params __user *params)
+{
+	struct io_ring_ctx *ctx;
+	struct io_uring_task *tctx;
+	struct file *file;
+	int ret;
+
+	ret = io_uring_fill_params(entries, p);
+	if (unlikely(ret))
+		return ret;
+
 	ctx = io_ring_ctx_alloc(p);
 	if (!ctx)
 		return -ENOMEM;
@@ -3630,6 +3660,9 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (ret)
 		goto err;
 
+	if (!(p->flags & IORING_SETUP_NO_SQARRAY))
+		p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
+
 	ret = io_sq_offload_create(ctx, p);
 	if (ret)
 		goto err;
@@ -3638,29 +3671,6 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
 	if (ret)
 		goto err;
 
-	p->sq_off.head = offsetof(struct io_rings, sq.head);
-	p->sq_off.tail = offsetof(struct io_rings, sq.tail);
-	p->sq_off.ring_mask = offsetof(struct io_rings, sq_ring_mask);
-	p->sq_off.ring_entries = offsetof(struct io_rings, sq_ring_entries);
-	p->sq_off.flags = offsetof(struct io_rings, sq_flags);
-	p->sq_off.dropped = offsetof(struct io_rings, sq_dropped);
-	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
-		p->sq_off.array = (char *)ctx->sq_array - (char *)ctx->rings;
-	p->sq_off.resv1 = 0;
-	if (!(ctx->flags & IORING_SETUP_NO_MMAP))
-		p->sq_off.user_addr = 0;
-
-	p->cq_off.head = offsetof(struct io_rings, cq.head);
-	p->cq_off.tail = offsetof(struct io_rings, cq.tail);
-	p->cq_off.ring_mask = offsetof(struct io_rings, cq_ring_mask);
-	p->cq_off.ring_entries = offsetof(struct io_rings, cq_ring_entries);
-	p->cq_off.overflow = offsetof(struct io_rings, cq_overflow);
-	p->cq_off.cqes = offsetof(struct io_rings, cqes);
-	p->cq_off.flags = offsetof(struct io_rings, cq_flags);
-	p->cq_off.resv1 = 0;
-	if (!(ctx->flags & IORING_SETUP_NO_MMAP))
-		p->cq_off.user_addr = 0;
-
 	p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
 		      IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
 		      IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL |
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 4a471a810f02..e3e6cb14de5d 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -70,6 +70,7 @@ static inline bool io_should_wake(struct io_wait_queue *iowq)
 
 unsigned long rings_size(unsigned int flags, unsigned int sq_entries,
 			 unsigned int cq_entries, size_t *sq_offset);
+int io_uring_fill_params(unsigned entries, struct io_uring_params *p);
 bool io_cqe_cache_refill(struct io_ring_ctx *ctx, bool overflow);
 int io_run_task_work_sig(struct io_ring_ctx *ctx);
 void io_req_defer_failed(struct io_kiocb *req, s32 res);
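The entry-count policy that io_uring_fill_params() now centralizes sits
between the hunks above: entries are validated and rounded up to a power of
two, IORING_SETUP_CLAMP caps out-of-range values instead of failing, and the
CQ ring defaults to twice the SQ size unless IORING_SETUP_CQSIZE supplies an
explicit count. A hedged userspace model of that policy, assuming it matches
the kernel logic (model_fill_entries() and roundup_pow2() are illustrative
names, and the flag values mirror the uapi):

#include <errno.h>

#define MAX_ENTRIES     32768                   /* IORING_MAX_ENTRIES */
#define MAX_CQ_ENTRIES  (2 * MAX_ENTRIES)       /* IORING_MAX_CQ_ENTRIES */
#define SETUP_CQSIZE    (1U << 3)               /* IORING_SETUP_CQSIZE */
#define SETUP_CLAMP     (1U << 4)               /* IORING_SETUP_CLAMP */

static unsigned roundup_pow2(unsigned v)
{
        unsigned p = 1;

        while (p < v && p < (1U << 31))
                p <<= 1;
        return p;
}

/* Model of the entry sizing performed in io_uring_fill_params(). */
static int model_fill_entries(unsigned entries, unsigned flags,
                              unsigned cq_req,
                              unsigned *sq_out, unsigned *cq_out)
{
        if (!entries)
                return -EINVAL;
        if (entries > MAX_ENTRIES) {
                if (!(flags & SETUP_CLAMP))
                        return -EINVAL;
                entries = MAX_ENTRIES;  /* CLAMP caps rather than fails */
        }
        *sq_out = roundup_pow2(entries);

        if (flags & SETUP_CQSIZE) {
                if (!cq_req)
                        return -EINVAL;
                cq_req = roundup_pow2(cq_req);
                if (cq_req > MAX_CQ_ENTRIES) {
                        if (!(flags & SETUP_CLAMP))
                                return -EINVAL;
                        cq_req = MAX_CQ_ENTRIES;
                }
                if (cq_req < *sq_out)   /* CQ may not be smaller than SQ */
                        return -EINVAL;
                *cq_out = cq_req;
        } else {
                *cq_out = 2 * *sq_out;  /* default: CQ twice the SQ size */
        }
        return 0;
}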
From patchwork Tue Oct 22 02:08:30 2024
From: Jens Axboe <axboe@kernel.dk>
To: io-uring@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 3/3] io_uring/register: add IORING_REGISTER_RESIZE_RINGS
Date: Mon, 21 Oct 2024 20:08:30 -0600
Message-ID: <20241022021159.820925-4-axboe@kernel.dk>
In-Reply-To: <20241022021159.820925-1-axboe@kernel.dk>
References: <20241022021159.820925-1-axboe@kernel.dk>

Once a ring has been created, the sizes of the CQ and SQ rings are
fixed. Usually this isn't a problem on the SQ ring side, as it merely
controls the number of requests that can be submitted in a single
system call, and there's rarely a need to change that.

For the CQ ring, it's a different story. For most efficient use of
io_uring, it's important that the CQ ring never overflows. This means
that applications must size it for the worst case scenario, which can
be wasteful.

Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize
the existing rings. It takes a struct io_uring_params argument, the
same one which is used to set up the ring initially, and resizes the
rings according to the sizes given.

Certain properties are always inherited from the original ring setup,
like SQE128/CQE32 and other setup options. The implementation only
allows the flags associated with how the CQ ring is sized and clamped.

Existing unconsumed SQE and CQE entries are copied as part of the
process. Any register op holds ->uring_lock, which prevents new
submissions, and the internal mapping holds the completion lock as
well across moving CQ ring state.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/uapi/linux/io_uring.h |   3 +
 io_uring/register.c           | 161 ++++++++++++++++++++++++++++++++++
 2 files changed, 164 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 86cb385fe0b5..c4737892c7cd 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -615,6 +615,9 @@ enum io_uring_register_op {
 	/* send MSG_RING without having a ring */
 	IORING_REGISTER_SEND_MSG_RING		= 31,
 
+	/* resize CQ ring */
+	IORING_REGISTER_RESIZE_RINGS		= 33,
+
 	/* this goes last */
 	IORING_REGISTER_LAST,
 
diff --git a/io_uring/register.c b/io_uring/register.c
index 52b2f9b74af8..8dfe46a1cfe4 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -29,6 +29,7 @@
 #include "napi.h"
 #include "eventfd.h"
 #include "msg_ring.h"
+#include "memmap.h"
 
 #define IORING_MAX_RESTRICTIONS	(IORING_RESTRICTION_LAST + \
 				 IORING_REGISTER_LAST + IORING_OP_LAST)
@@ -361,6 +362,160 @@ static int io_register_clock(struct io_ring_ctx *ctx,
 	return 0;
 }
 
+/*
+ * State to maintain until we can swap. Both new and old state, used for
+ * either mapping or freeing.
+ */
+struct io_ring_ctx_rings {
+	unsigned short n_ring_pages;
+	unsigned short n_sqe_pages;
+	struct page **ring_pages;
+	struct page **sqe_pages;
+	struct io_uring_sqe *sq_sqes;
+	struct io_rings *rings;
+};
+
+static void io_register_free_rings(struct io_uring_params *p,
+				   struct io_ring_ctx_rings *r)
+{
+	if (!(p->flags & IORING_SETUP_NO_MMAP)) {
+		io_pages_unmap(r->rings, &r->ring_pages, &r->n_ring_pages,
+				true);
+		io_pages_unmap(r->sq_sqes, &r->sqe_pages, &r->n_sqe_pages,
+				true);
+	} else {
+		io_pages_free(&r->ring_pages, r->n_ring_pages);
+		io_pages_free(&r->sqe_pages, r->n_sqe_pages);
+		vunmap(r->rings);
+		vunmap(r->sq_sqes);
+	}
+}
+
+#define swap_old(ctx, o, n, field)		\
+	do {					\
+		(o).field = (ctx)->field;	\
+		(ctx)->field = (n).field;	\
+	} while (0)
+
+#define RESIZE_FLAGS	(IORING_SETUP_CQSIZE | IORING_SETUP_CLAMP)
+#define COPY_FLAGS	(IORING_SETUP_NO_SQARRAY | IORING_SETUP_SQE128 | \
+			 IORING_SETUP_CQE32 | IORING_SETUP_NO_MMAP)
+
+static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
+{
+	struct io_ring_ctx_rings o = { }, n = { };
+	size_t size, sq_array_offset;
+	struct io_uring_params p;
+	unsigned i, tail;
+	void *ptr;
+	int ret;
+
+	if (copy_from_user(&p, arg, sizeof(p)))
+		return -EFAULT;
+	if (p.flags & ~RESIZE_FLAGS)
+		return -EINVAL;
+	/* nothing to do */
+	if (p.sq_entries == ctx->sq_entries && p.cq_entries == ctx->cq_entries)
+		return 0;
+	/* properties that are always inherited */
+	p.flags |= (ctx->flags & COPY_FLAGS);
+
+	ret = io_uring_fill_params(p.sq_entries, &p);
+	if (unlikely(ret))
+		return ret;
+
+	size = rings_size(p.flags, p.sq_entries, p.cq_entries,
+				&sq_array_offset);
+	if (size == SIZE_MAX)
+		return -EOVERFLOW;
+
+	if (!(p.flags & IORING_SETUP_NO_MMAP))
+		n.rings = io_pages_map(&n.ring_pages, &n.n_ring_pages, size);
+	else
+		n.rings = __io_uaddr_map(&n.ring_pages, &n.n_ring_pages,
+						p.cq_off.user_addr, size);
+	if (IS_ERR(n.rings))
+		return PTR_ERR(n.rings);
+
+	n.rings->sq_ring_mask = p.sq_entries - 1;
+	n.rings->cq_ring_mask = p.cq_entries - 1;
+	n.rings->sq_ring_entries = p.sq_entries;
+	n.rings->cq_ring_entries = p.cq_entries;
+
+	if (copy_to_user(arg, &p, sizeof(p))) {
+		io_register_free_rings(&p, &n);
+		return -EFAULT;
+	}
+
+	if (p.flags & IORING_SETUP_SQE128)
+		size = array_size(2 * sizeof(struct io_uring_sqe), p.sq_entries);
+	else
+		size = array_size(sizeof(struct io_uring_sqe), p.sq_entries);
+	if (size == SIZE_MAX) {
+		io_register_free_rings(&p, &n);
+		return -EOVERFLOW;
+	}
+
+	if (!(p.flags & IORING_SETUP_NO_MMAP))
+		ptr = io_pages_map(&n.sqe_pages, &n.n_sqe_pages, size);
+	else
+		ptr = __io_uaddr_map(&n.sqe_pages, &n.n_sqe_pages,
+					p.sq_off.user_addr,
+					size);
+	if (IS_ERR(ptr)) {
+		io_register_free_rings(&p, &n);
+		return PTR_ERR(ptr);
+	}
+
+	/* now copy entries, if any */
+	n.sq_sqes = ptr;
+	tail = ctx->rings->sq.tail;
+	for (i = ctx->rings->sq.head; i < tail; i++) {
+		unsigned src_head = i & (ctx->sq_entries - 1);
+		unsigned dst_head = i & n.rings->sq_ring_mask;
+
+		n.sq_sqes[dst_head] = ctx->sq_sqes[src_head];
+	}
+	n.rings->sq.head = ctx->rings->sq.head;
+	n.rings->sq.tail = ctx->rings->sq.tail;
+
+	spin_lock(&ctx->completion_lock);
+	tail = ctx->rings->cq.tail;
+	for (i = ctx->rings->cq.head; i < tail; i++) {
+		unsigned src_head = i & (ctx->cq_entries - 1);
+		unsigned dst_head = i & n.rings->cq_ring_mask;
+
+		n.rings->cqes[dst_head] = ctx->rings->cqes[src_head];
+	}
+	n.rings->cq.head = ctx->rings->cq.head;
+	n.rings->cq.tail = ctx->rings->cq.tail;
+	/* invalidate cached cqe refill */
+	ctx->cqe_cached = ctx->cqe_sentinel = NULL;
+
+	n.rings->sq_dropped = ctx->rings->sq_dropped;
+	n.rings->sq_flags = ctx->rings->sq_flags;
+	n.rings->cq_flags = ctx->rings->cq_flags;
+	n.rings->cq_overflow = ctx->rings->cq_overflow;
+
+	/* all done, store old pointers and assign new ones */
+	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
+		ctx->sq_array = (u32 *)((char *)n.rings + sq_array_offset);
+
+	ctx->sq_entries = p.sq_entries;
+	ctx->cq_entries = p.cq_entries;
+
+	swap_old(ctx, o, n, rings);
+	swap_old(ctx, o, n, n_ring_pages);
+	swap_old(ctx, o, n, n_sqe_pages);
+	swap_old(ctx, o, n, ring_pages);
+	swap_old(ctx, o, n, sqe_pages);
+	swap_old(ctx, o, n, sq_sqes);
+	spin_unlock(&ctx->completion_lock);
+
+	io_register_free_rings(&p, &o);
+	return 0;
+}
+
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -549,6 +704,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_register_clone_buffers(ctx, arg);
 		break;
+	case IORING_REGISTER_RESIZE_RINGS:
+		ret = -EINVAL;
+		if (!arg || nr_args != 1)
+			break;
+		ret = io_register_resize_rings(ctx, arg);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
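Assuming the series lands in this form, a minimal sketch of driving the new
opcode from userspace via the raw io_uring_register(2) syscall could look as
follows; resize_rings() and the locally defined opcode constant are
illustrative, and liburing support is not assumed:

#include <linux/io_uring.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Opcode value from this series; newer uapi headers may provide the enum. */
#define REGISTER_RESIZE_RINGS	33

/*
 * Ask the kernel to resize an existing ring. Only the sizing-related
 * setup flags (IORING_SETUP_CQSIZE/IORING_SETUP_CLAMP) may be passed in;
 * everything else is inherited from the original ring setup.
 */
static int resize_rings(int ring_fd, unsigned sq_entries, unsigned cq_entries)
{
	struct io_uring_params p;

	memset(&p, 0, sizeof(p));
	p.sq_entries = sq_entries;
	p.cq_entries = cq_entries;
	p.flags = IORING_SETUP_CQSIZE;	/* size the CQ ring explicitly */

	/* nr_args must be 1, with arg pointing at the params struct */
	return syscall(__NR_io_uring_register, ring_fd,
		       REGISTER_RESIZE_RINGS, &p, 1);
}

On success, the kernel copies the updated struct io_uring_params back to
userspace, including the refreshed ring offsets, per the copy_to_user() call
in the patch.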