From patchwork Thu Oct 31 01:44:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13857409
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 1/2] io_uring/rsrc: allow cloning at an offset
Date: Wed, 30 Oct 2024 19:44:55 -0600
Message-ID: <20241031014629.206573-2-axboe@kernel.dk>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20241031014629.206573-1-axboe@kernel.dk>
References: <20241031014629.206573-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

Right now buffer cloning is an all-or-nothing kind of thing - either the
whole table is cloned from a source to a destination ring, or nothing at
all. However, it's not always desired to clone the whole thing. Allow for
the application to specify a source and destination offset, and a number
of buffers to clone. If the destination offset is non-zero, then allocate
sparse nodes upfront.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |  5 ++++-
 io_uring/rsrc.c               | 32 ++++++++++++++++++++++++++------
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 024745283783..cc8dbe78c126 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -719,7 +719,10 @@ enum {
 struct io_uring_clone_buffers {
 	__u32	src_fd;
 	__u32	flags;
-	__u32	pad[6];
+	__u32	src_off;
+	__u32	dst_off;
+	__u32	nr;
+	__u32	pad[3];
 };
 
 struct io_uring_buf {
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index af60d9f597be..d00870128bb9 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -924,10 +924,11 @@ int io_import_fixed(int ddir, struct iov_iter *iter,
 	return 0;
 }
 
-static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx)
+static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx,
+			    struct io_uring_clone_buffers *arg)
 {
+	int i, ret, nbufs, off, nr;
 	struct io_rsrc_data data;
-	int i, ret, nbufs;
 
 	/*
 	 * Drop our own lock here. We'll setup the data we need and reference
@@ -940,11 +941,29 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
 	nbufs = src_ctx->buf_table.nr;
 	if (!nbufs)
 		goto out_unlock;
-	ret = io_rsrc_data_alloc(&data, nbufs);
+	ret = -EINVAL;
+	if (!arg->nr)
+		arg->nr = nbufs;
+	else if (arg->nr > nbufs)
+		goto out_unlock;
+	ret = -EOVERFLOW;
+	if (check_add_overflow(arg->nr, arg->src_off, &off))
+		goto out_unlock;
+	if (off > nbufs)
+		goto out_unlock;
+	if (check_add_overflow(arg->nr, arg->dst_off, &off))
+		goto out_unlock;
+	ret = -EINVAL;
+	if (off > IORING_MAX_REG_BUFFERS)
+		goto out_unlock;
+	ret = io_rsrc_data_alloc(&data, off);
 	if (ret)
 		goto out_unlock;
 
-	for (i = 0; i < nbufs; i++) {
+	off = arg->dst_off;
+	i = arg->src_off;
+	nr = arg->nr;
+	while (nr--) {
 		struct io_rsrc_node *dst_node, *src_node;
 
 		src_node = io_rsrc_node_lookup(&src_ctx->buf_table, i);
@@ -960,7 +979,8 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
 			refcount_inc(&src_node->buf->refs);
 			dst_node->buf = src_node->buf;
 		}
-		data.nodes[i] = dst_node;
+		data.nodes[off++] = dst_node;
+		i++;
 	}
 
 	/* Have a ref on the bufs now, drop src lock and re-grab our own lock */
@@ -1015,7 +1035,7 @@ int io_register_clone_buffers(struct io_ring_ctx *ctx, void __user *arg)
 	file = io_uring_register_get_file(buf.src_fd, registered_src);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
-	ret = io_clone_buffers(ctx, file->private_data);
+	ret = io_clone_buffers(ctx, file->private_data, &buf);
 	if (!registered_src)
 		fput(file);
 	return ret;

From patchwork Thu Oct 31 01:44:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 13857410
From: Jens Axboe
To: io-uring@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 2/2] io_uring/rsrc: allow cloning with node replacements
Date: Wed, 30 Oct 2024 19:44:56 -0600
Message-ID: <20241031014629.206573-3-axboe@kernel.dk>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20241031014629.206573-1-axboe@kernel.dk>
References: <20241031014629.206573-1-axboe@kernel.dk>
X-Mailing-List: io-uring@vger.kernel.org

Currently cloning a buffer table will fail if the destination already has
a table. But it should be possible to use it to replace existing elements.
Add an IORING_REGISTER_DST_REPLACE cloning flag which, if set, allows the
destination to already have a buffer table. If that is the case, then
entries designated by offset + nr buffers will be replaced if they already
exist.

Note that it's allowed to use IORING_REGISTER_DST_REPLACE and not have an
existing table, in which case it'll work just like not having the flag set
and an empty table - it'll just assign the newly created table for that
case.

Signed-off-by: Jens Axboe
---
 include/uapi/linux/io_uring.h |  3 +-
 io_uring/rsrc.c               | 66 +++++++++++++++++++++++++++--------
 2 files changed, 54 insertions(+), 15 deletions(-)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index cc8dbe78c126..ce58c4590de6 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -713,7 +713,8 @@ struct io_uring_clock_register {
 };
 
 enum {
-	IORING_REGISTER_SRC_REGISTERED = 1,
+	IORING_REGISTER_SRC_REGISTERED = (1U << 0),
+	IORING_REGISTER_DST_REPLACE = (1U << 1),
 };
 
 struct io_uring_clone_buffers {
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index d00870128bb9..673ff00da727 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -927,8 +927,40 @@ int io_import_fixed(int ddir, struct iov_iter *iter,
 static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx,
 			    struct io_uring_clone_buffers *arg)
 {
-	int i, ret, nbufs, off, nr;
 	struct io_rsrc_data data;
+	int i, ret, off, nr;
+	unsigned int nbufs;
+
+	/* if offsets are given, must have nr specified too */
+	if (!arg->nr && (arg->dst_off || arg->src_off))
+		return -EINVAL;
+	/* not allowed unless REPLACE is set */
+	if (ctx->buf_table.nr && !(arg->flags & IORING_REGISTER_DST_REPLACE))
+		return -EBUSY;
+
+	nbufs = READ_ONCE(src_ctx->buf_table.nr);
+	if (!arg->nr)
+		arg->nr = nbufs;
+	else if (arg->nr > nbufs)
+		return -EINVAL;
+	else if (arg->nr > IORING_MAX_REG_BUFFERS)
+		return -EINVAL;
+	if (check_add_overflow(arg->nr, arg->dst_off, &nbufs))
+		return -EOVERFLOW;
+
+	ret = io_rsrc_data_alloc(&data, max(nbufs, ctx->buf_table.nr));
+	if (ret)
+		return ret;
+
+	/* Fill entries in data from dst that won't overlap with src */
+	for (i = 0; i < min(arg->dst_off, ctx->buf_table.nr); i++) {
+		struct io_rsrc_node *src_node = ctx->buf_table.nodes[i];
+
+		if (src_node) {
+			data.nodes[i] = src_node;
+			src_node->refs++;
+		}
+	}
 
 	/*
 	 * Drop our own lock here. We'll setup the data we need and reference
@@ -951,14 +983,6 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
 		goto out_unlock;
 	if (off > nbufs)
 		goto out_unlock;
-	if (check_add_overflow(arg->nr, arg->dst_off, &off))
-		goto out_unlock;
-	ret = -EINVAL;
-	if (off > IORING_MAX_REG_BUFFERS)
-		goto out_unlock;
-	ret = io_rsrc_data_alloc(&data, off);
-	if (ret)
-		goto out_unlock;
 
 	off = arg->dst_off;
 	i = arg->src_off;
@@ -986,6 +1010,20 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
 	/* Have a ref on the bufs now, drop src lock and re-grab our own lock */
 	mutex_unlock(&src_ctx->uring_lock);
 	mutex_lock(&ctx->uring_lock);
+
+	/*
+	 * If asked for replace, put the old table. data->nodes[] holds both
+	 * old and new nodes at this point.
+	 */
+	if (arg->flags & IORING_REGISTER_DST_REPLACE)
+		io_rsrc_data_free(&ctx->buf_table);
+
+	/*
+	 * ctx->buf_table should be empty now - either the contents are being
+	 * replaced and we just freed the table, or someone raced setting up
+	 * a buffer table while the clone was happening. If not empty, fall
+	 * through to failure handling.
+	 */
 	if (!ctx->buf_table.nr) {
 		ctx->buf_table = data;
 		return 0;
@@ -995,14 +1033,14 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
 	mutex_lock(&src_ctx->uring_lock);
 	/* someone raced setting up buffers, dump ours */
 	ret = -EBUSY;
-	i = nbufs;
 out_put_free:
+	i = data.nr;
 	while (i--) {
 		io_buffer_unmap(src_ctx, data.nodes[i]);
 		kfree(data.nodes[i]);
 	}
-	io_rsrc_data_free(&data);
 out_unlock:
+	io_rsrc_data_free(&data);
 	mutex_unlock(&src_ctx->uring_lock);
 	mutex_lock(&ctx->uring_lock);
 	return ret;
@@ -1022,12 +1060,12 @@ int io_register_clone_buffers(struct io_ring_ctx *ctx, void __user *arg)
 	struct file *file;
 	int ret;
 
-	if (ctx->buf_table.nr)
-		return -EBUSY;
 	if (copy_from_user(&buf, arg, sizeof(buf)))
 		return -EFAULT;
-	if (buf.flags & ~IORING_REGISTER_SRC_REGISTERED)
+	if (buf.flags & ~(IORING_REGISTER_SRC_REGISTERED|IORING_REGISTER_DST_REPLACE))
 		return -EINVAL;
+	if (!(buf.flags & IORING_REGISTER_DST_REPLACE) && ctx->buf_table.nr)
+		return -EBUSY;
 	if (memchr_inv(buf.pad, 0, sizeof(buf.pad)))
 		return -EINVAL;
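
The offset checks that patch 1/2 adds to io_clone_buffers() can be tried
outside the kernel. What follows is a minimal userspace sketch, not kernel
code. model_clone_range() and MODEL_MAX_REG_BUFFERS are hypothetical names,
the limit of 1U << 14 is an assumed value for IORING_MAX_REG_BUFFERS, and the
GCC/Clang builtin __builtin_add_overflow() stands in for the kernel's
check_add_overflow(). It also assumes a non-empty source table, since the
error code for an empty one is set outside the hunks shown.

```c
#include <assert.h>
#include <errno.h>

/* Assumed value; the patch only references IORING_MAX_REG_BUFFERS by name. */
#define MODEL_MAX_REG_BUFFERS	(1U << 14)

/*
 * Hypothetical userspace model of the range checks from patch 1/2.
 * On success, returns 0 and stores the size of the destination table
 * (dst_off + nr) in *table_size. On failure, returns the negative
 * errno the patch would return for the same input.
 */
static int model_clone_range(unsigned int src_nbufs, unsigned int src_off,
			     unsigned int dst_off, unsigned int nr,
			     unsigned int *table_size)
{
	int nbufs = (int)src_nbufs;
	int off;

	/* Precondition of this sketch: a non-empty, in-range source table */
	assert(src_nbufs > 0 && src_nbufs <= MODEL_MAX_REG_BUFFERS);
	assert(table_size);

	/* nr == 0 means "as many buffers as the source table holds" */
	if (!nr)
		nr = src_nbufs;
	else if (nr > src_nbufs)
		return -EINVAL;

	/* The source range [src_off, src_off + nr) must fit the source table */
	if (__builtin_add_overflow(nr, src_off, &off))
		return -EOVERFLOW;
	if (off > nbufs)
		return -EOVERFLOW;

	/* The destination range end must not wrap or exceed the table limit */
	if (__builtin_add_overflow(nr, dst_off, &off))
		return -EOVERFLOW;
	if ((unsigned int)off > MODEL_MAX_REG_BUFFERS)
		return -EINVAL;

	*table_size = (unsigned int)off;
	return 0;
}
```

With an 8-entry source table, cloning 3 buffers from source offset 2 to
destination offset 4 passes these checks and sizes the destination table at
7 entries, which leaves slots 0-3 sparse.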