From patchwork Thu Nov 14 17:38:31 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13875521
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v2 1/6] io_uring: fortify io_pin_pages with a warning
Date: Thu, 14 Nov 2024 17:38:31 +0000
X-Mailer: git-send-email 2.46.0

We're a bit too frivolous with the types of nr_pages arguments,
converting them to long and back to int, passing an unsigned int pointer
as an int pointer and so on. It shouldn't cause any problems, but it
should be carefully reviewed. Until then, let's add a WARN_ON_ONCE check
to be more confident that callers don't pass poorly checked arguments.

Signed-off-by: Pavel Begunkov
---
 io_uring/memmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 85c66fa54956..6ab59c60dfd0 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -140,6 +140,8 @@ struct page **io_pin_pages(unsigned long uaddr, unsigned long len, int *npages)
 	nr_pages = end - start;
 	if (WARN_ON_ONCE(!nr_pages))
 		return ERR_PTR(-EINVAL);
+	if (WARN_ON_ONCE(nr_pages > INT_MAX))
+		return ERR_PTR(-EOVERFLOW);

 	pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL);
 	if (!pages)

From patchwork Thu Nov 14 17:38:32 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13875522
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v2 2/6] io_uring: disable ENTER_EXT_ARG_REG for IOPOLL
Date: Thu, 14 Nov 2024 17:38:32 +0000
X-Mailer: git-send-email 2.46.0

IOPOLL doesn't use the extended arguments, so there is no need for it to
support IORING_ENTER_EXT_ARG_REG. Let's disable it for IOPOLL; if
anything, it leaves more space for future extensions.
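For illustration only, the resulting IOPOLL-side behaviour can be sketched in plain userspace C. The function name, the `SK_` flag macros and the `expected` size parameter are hypothetical stand-ins, not kernel symbols; the authoritative flag values live in include/uapi/linux/io_uring.h.

```c
#include <errno.h>
#include <stddef.h>

/* Sketch-local flag bits (assumed); see include/uapi/linux/io_uring.h. */
#define SK_ENTER_EXT_ARG	(1U << 3)
#define SK_ENTER_EXT_ARG_REG	(1U << 6)

/*
 * Hypothetical stand-in for the IOPOLL argument check: extended
 * arguments are still size-checked, but the registered variant is
 * rejected outright.
 */
static int iopoll_check_ext_arg(unsigned int flags, size_t argsz,
				size_t expected)
{
	if (!(flags & SK_ENTER_EXT_ARG))
		return 0;		/* no extended argument supplied */
	if (flags & SK_ENTER_EXT_ARG_REG)
		return -EINVAL;		/* registered waits not offered for IOPOLL */
	if (argsz != expected)
		return -EINVAL;		/* size must match the argument struct */
	return 0;
}
```

Rejecting the flag rather than ignoring it keeps the bit free for a different meaning on the IOPOLL path later.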

Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index bd71782057de..464a70bde7e6 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3214,12 +3214,8 @@ static int io_validate_ext_arg(struct io_ring_ctx *ctx, unsigned flags,

 	if (!(flags & IORING_ENTER_EXT_ARG))
 		return 0;
-
-	if (flags & IORING_ENTER_EXT_ARG_REG) {
-		if (argsz != sizeof(struct io_uring_reg_wait))
-			return -EINVAL;
-		return PTR_ERR(io_get_ext_arg_reg(ctx, argp));
-	}
+	if (flags & IORING_ENTER_EXT_ARG_REG)
+		return -EINVAL;
 	if (argsz != sizeof(arg))
 		return -EINVAL;
 	if (copy_from_user(&arg, argp, sizeof(arg)))

From patchwork Thu Nov 14 17:38:33 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13875523
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v2 3/6] io_uring: temporarily disable registered waits
Date: Thu, 14 Nov 2024 17:38:33 +0000
Message-ID: <70b1d1d218c41ba77a76d1789c8641dab0b0563e.1731604990.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.46.0

Disable wait argument registration as it'll be replaced with a more
generic feature. We'll still need IORING_ENTER_EXT_ARG_REG parsing in
a few commits so leave it be.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h | 10 -----
 include/uapi/linux/io_uring.h  |  3 --
 io_uring/io_uring.c            | 10 -----
 io_uring/register.c            | 82 ----------------------------------
 io_uring/register.h            |  1 -
 5 files changed, 106 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 072e65e93105..52a5da99a205 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -330,14 +330,6 @@ struct io_ring_ctx {
 		atomic_t		cq_wait_nr;
 		atomic_t		cq_timeouts;
 		struct wait_queue_head	cq_wait;
-
-		/*
-		 * If registered with IORING_REGISTER_CQWAIT_REG, a single
-		 * page holds N entries, mapped in cq_wait_arg. cq_wait_index
-		 * is the maximum allowable index.
-		 */
-		struct io_uring_reg_wait	*cq_wait_arg;
-		unsigned char			cq_wait_index;
 	} ____cacheline_aligned_in_smp;

 	/* timeouts */
@@ -431,8 +423,6 @@ struct io_ring_ctx {
 	unsigned short			n_sqe_pages;
 	struct page			**ring_pages;
 	struct page			**sqe_pages;
-
-	struct page			**cq_wait_page;
 };

 struct io_tw_state {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 5d08435b95a8..132f5db3d4e8 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -627,9 +627,6 @@ enum io_uring_register_op {
 	/* resize CQ ring */
 	IORING_REGISTER_RESIZE_RINGS		= 33,

-	/* register fixed io_uring_reg_wait arguments */
-	IORING_REGISTER_CQWAIT_REG		= 34,
-
 	/* this goes last */
 	IORING_REGISTER_LAST,

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 464a70bde7e6..286b7bb73978 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2709,7 +2709,6 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	io_alloc_cache_free(&ctx->msg_cache, io_msg_cache_free);
 	io_futex_cache_free(ctx);
 	io_destroy_buffers(ctx);
-	io_unregister_cqwait_reg(ctx);
 	mutex_unlock(&ctx->uring_lock);
 	if (ctx->sq_creds)
 		put_cred(ctx->sq_creds);
@@ -3195,15 +3194,6 @@ void __io_uring_cancel(bool cancel_all)
 static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
 			const struct io_uring_getevents_arg __user *uarg)
 {
-	struct io_uring_reg_wait *arg = READ_ONCE(ctx->cq_wait_arg);
-
-	if (arg) {
-		unsigned int index = (unsigned int) (uintptr_t) uarg;
-
-		if (index <= ctx->cq_wait_index)
-			return arg + index;
-	}
-
 	return ERR_PTR(-EFAULT);
 }

diff --git a/io_uring/register.c b/io_uring/register.c
index 45edfc57963a..3c5a3cfb186b 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -570,82 +570,6 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	return ret;
 }

-void io_unregister_cqwait_reg(struct io_ring_ctx *ctx)
-{
-	unsigned short npages = 1;
-
-	if (!ctx->cq_wait_page)
-		return;
-
-	io_pages_unmap(ctx->cq_wait_arg, &ctx->cq_wait_page, &npages, true);
-	ctx->cq_wait_arg = NULL;
-	if (ctx->user)
-		__io_unaccount_mem(ctx->user, 1);
-}
-
-/*
- * Register a page holding N entries of struct io_uring_reg_wait, which can
- * be used via io_uring_enter(2) if IORING_GETEVENTS_EXT_ARG_REG is set.
- * If that is set with IORING_GETEVENTS_EXT_ARG, then instead of passing
- * in a pointer for a struct io_uring_getevents_arg, an index into this
- * registered array is passed, avoiding two (arg + timeout) copies per
- * invocation.
- */
-static int io_register_cqwait_reg(struct io_ring_ctx *ctx, void __user *uarg)
-{
-	struct io_uring_cqwait_reg_arg arg;
-	struct io_uring_reg_wait *reg;
-	struct page **pages;
-	unsigned long len;
-	int nr_pages, poff;
-	int ret;
-
-	if (ctx->cq_wait_page || ctx->cq_wait_arg)
-		return -EBUSY;
-	if (copy_from_user(&arg, uarg, sizeof(arg)))
-		return -EFAULT;
-	if (!arg.nr_entries || arg.flags)
-		return -EINVAL;
-	if (arg.struct_size != sizeof(*reg))
-		return -EINVAL;
-	if (check_mul_overflow(arg.struct_size, arg.nr_entries, &len))
-		return -EOVERFLOW;
-	if (len > PAGE_SIZE)
-		return -EINVAL;
-	/* offset + len must fit within a page, and must be reg_wait aligned */
-	poff = arg.user_addr & ~PAGE_MASK;
-	if (len + poff > PAGE_SIZE)
-		return -EINVAL;
-	if (poff % arg.struct_size)
-		return -EINVAL;
-
-	pages = io_pin_pages(arg.user_addr, len, &nr_pages);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-	ret = -EINVAL;
-	if (nr_pages != 1)
-		goto out_free;
-	if (ctx->user) {
-		ret = __io_account_mem(ctx->user, 1);
-		if (ret)
-			goto out_free;
-	}
-
-	reg = vmap(pages, 1, VM_MAP, PAGE_KERNEL);
-	if (reg) {
-		ctx->cq_wait_index = arg.nr_entries - 1;
-		WRITE_ONCE(ctx->cq_wait_page, pages);
-		WRITE_ONCE(ctx->cq_wait_arg, (void *) reg + poff);
-		return 0;
-	}
-	ret = -ENOMEM;
-	if (ctx->user)
-		__io_unaccount_mem(ctx->user, 1);
-out_free:
-	io_pages_free(&pages, nr_pages);
-	return ret;
-}
-
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -840,12 +764,6 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_register_resize_rings(ctx, arg);
 		break;
-	case IORING_REGISTER_CQWAIT_REG:
-		ret = -EINVAL;
-		if (!arg || nr_args != 1)
-			break;
-		ret = io_register_cqwait_reg(ctx, arg);
-		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/io_uring/register.h b/io_uring/register.h
index 3e935e8fa4b2..a5f39d5ef9e0 100644
--- a/io_uring/register.h
+++ b/io_uring/register.h
@@ -5,6 +5,5 @@
 int io_eventfd_unregister(struct io_ring_ctx *ctx);
 int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id);
 struct file *io_uring_register_get_file(unsigned int fd, bool registered);
-void io_unregister_cqwait_reg(struct io_ring_ctx *ctx);

 #endif

From patchwork Thu Nov 14 17:38:34 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13875524
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v2 4/6] io_uring: introduce concept of memory regions
Date: Thu, 14 Nov 2024 17:38:34 +0000
Message-ID: <069d94bca26aac066771574756ca007d0b68989a.1731604990.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.46.0

We've got a good number of mappings we share with userspace, including
the main rings, provided buffer rings, upcoming rings for zerocopy rx
and more. All of them duplicate user argument parsing and some internal
details as well (page pinning, huge page optimisations, mmap'ing, etc.)

Introduce a notion of regions. For userspace, for now it's just a new
structure called struct io_uring_region_desc, which is supposed to
parameterise all such mapping / queue creations. A region either
represents a user provided chunk of memory, in which case the user_addr
field should point to it, or a request for the kernel to allocate the
memory, in which case the user would need to mmap it after using the
offset returned in the mmap_offset field. With a uniform userspace API
we can avoid additional boilerplate code and apply future optimisations
to all of them at once.

Internally, there is a new structure struct io_mapped_region holding all
relevant runtime information and some helpers to work with it. This
patch limits it to user provided regions.
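To make the constraints on a user provided region concrete, here is a minimal userspace sketch of the descriptor checks described above. It is not the kernel code: `struct region_desc_sketch`, `region_desc_check()` and the `SKETCH_` macros are hypothetical names, a 4 KiB page size is assumed, and the sketch returns negative errno values throughout.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE		(1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_MEM_REGION_TYPE_USER	1

/* Local mirror of the struct io_uring_region_desc layout from this patch. */
struct region_desc_sketch {
	uint64_t user_addr;
	uint64_t size;
	uint32_t flags;
	uint32_t id;
	uint64_t mmap_offset;
	uint64_t resv[4];
};

/* Returns 0 if the descriptor describes a valid user provided region. */
static int region_desc_check(const struct region_desc_sketch *reg)
{
	static const uint64_t zero[4];
	uint64_t end;

	if (memcmp(reg->resv, zero, sizeof(zero)))
		return -EINVAL;		/* reserved fields must be zero */
	if (reg->flags != SKETCH_MEM_REGION_TYPE_USER)
		return -EINVAL;		/* only user memory is supported */
	if (!reg->user_addr)
		return -EFAULT;
	if (!reg->size || reg->mmap_offset || reg->id)
		return -EINVAL;
	if ((reg->size >> SKETCH_PAGE_SHIFT) > INT_MAX)
		return -E2BIG;		/* page count must fit in an int */
	if ((reg->user_addr | reg->size) & (SKETCH_PAGE_SIZE - 1))
		return -EINVAL;		/* address and size are page aligned */
	if (__builtin_add_overflow(reg->user_addr, reg->size, &end))
		return -EOVERFLOW;
	return 0;
}
```

A caller would typically obtain the memory with a page-aligned allocation, fill in user_addr, size and flags, and leave every other field zeroed.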
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 6 ++++ include/uapi/linux/io_uring.h | 14 ++++++++ io_uring/memmap.c | 65 ++++++++++++++++++++++++++++++++++ io_uring/memmap.h | 14 ++++++++ 4 files changed, 99 insertions(+) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 52a5da99a205..1d3a37234ace 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -75,6 +75,12 @@ struct io_hash_table { unsigned hash_bits; }; +struct io_mapped_region { + struct page **pages; + void *vmap_ptr; + size_t nr_pages; +}; + /* * Arbitrary limit, can be raised if need be */ diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 132f5db3d4e8..5cbfd330c688 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -647,6 +647,20 @@ struct io_uring_files_update { __aligned_u64 /* __s32 * */ fds; }; +enum { + /* initialise with user provided memory pointed by user_addr */ + IORING_MEM_REGION_TYPE_USER = 1, +}; + +struct io_uring_region_desc { + __u64 user_addr; + __u64 size; + __u32 flags; + __u32 id; + __u64 mmap_offset; + __u64 __resv[4]; +}; + /* * Register a fully sparse file space, rather than pass in an array of all * -1 file descriptors. 
diff --git a/io_uring/memmap.c b/io_uring/memmap.c index 6ab59c60dfd0..510c75b88a07 100644 --- a/io_uring/memmap.c +++ b/io_uring/memmap.c @@ -12,6 +12,7 @@ #include "memmap.h" #include "kbuf.h" +#include "rsrc.h" static void *io_mem_alloc_compound(struct page **pages, int nr_pages, size_t size, gfp_t gfp) @@ -194,6 +195,70 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages, return ERR_PTR(-ENOMEM); } +void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr) +{ + if (mr->pages) + unpin_user_pages(mr->pages, mr->nr_pages); + if (mr->vmap_ptr) + vunmap(mr->vmap_ptr); + if (mr->nr_pages && ctx->user) + __io_unaccount_mem(ctx->user, mr->nr_pages); + + memset(mr, 0, sizeof(*mr)); +} + +int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr, + struct io_uring_region_desc *reg) +{ + int pages_accounted = 0; + struct page **pages; + int nr_pages, ret; + void *vptr; + u64 end; + + if (WARN_ON_ONCE(mr->pages || mr->vmap_ptr || mr->nr_pages)) + return -EFAULT; + if (memchr_inv(®->__resv, 0, sizeof(reg->__resv))) + return -EINVAL; + if (reg->flags != IORING_MEM_REGION_TYPE_USER) + return -EINVAL; + if (!reg->user_addr) + return -EFAULT; + if (!reg->size || reg->mmap_offset || reg->id) + return -EINVAL; + if ((reg->size >> PAGE_SHIFT) > INT_MAX) + return E2BIG; + if ((reg->user_addr | reg->size) & ~PAGE_MASK) + return -EINVAL; + if (check_add_overflow(reg->user_addr, reg->size, &end)) + return -EOVERFLOW; + + pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages); + if (IS_ERR(pages)) + return PTR_ERR(pages); + + if (ctx->user) { + ret = __io_account_mem(ctx->user, nr_pages); + if (ret) + goto out_free; + pages_accounted = nr_pages; + } + + vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); + if (!vptr) + goto out_free; + + mr->pages = pages; + mr->vmap_ptr = vptr; + mr->nr_pages = nr_pages; + return 0; +out_free: + if (pages_accounted) + __io_unaccount_mem(ctx->user, pages_accounted); + io_pages_free(&pages, nr_pages); 
+ return ret; +} + static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff, size_t sz) { diff --git a/io_uring/memmap.h b/io_uring/memmap.h index 5cec5b7ac49a..f361a635b6c7 100644 --- a/io_uring/memmap.h +++ b/io_uring/memmap.h @@ -22,4 +22,18 @@ unsigned long io_uring_get_unmapped_area(struct file *file, unsigned long addr, unsigned long flags); int io_uring_mmap(struct file *file, struct vm_area_struct *vma); +void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr); +int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr, + struct io_uring_region_desc *reg); + +static inline void *io_region_get_ptr(struct io_mapped_region *mr) +{ + return mr->vmap_ptr; +} + +static inline bool io_region_is_set(struct io_mapped_region *mr) +{ + return !!mr->nr_pages; +} + #endif From patchwork Thu Nov 14 17:38:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13875525 Received: from mail-ed1-f48.google.com (mail-ed1-f48.google.com [209.85.208.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9798B18C32A for ; Thu, 14 Nov 2024 17:38:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.208.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731605894; cv=none; b=b+cdlLmzZReVHLuV0w9EngDbLKASpIbIl5w8AFUycW0Ql5gcM+HZNs+qNFMlQAQdFKwpsr9SrzPnuxfp0higP1plVx2lk5PIwBYRjC3PrISQVuEoB08NhTyqinS2h8k24Gh9GeskQ3mPc4NZEawNcUfpEf5c5NuXo7qMujdTk8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731605894; c=relaxed/simple; bh=Hvv9oNYi9s1ueMiugtSN67O5L03GHYmR01AxOAkhpj4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=iMDzyfbEwhedQ8PpPaisgWbqOKtDK5SS7ZbdcnSbvJRfIXIt21WlSMUO0cCDD/161qBwQi2okn6QJbdBuzk+OQDlLUPbtDuvvmVTgzpU5aOqxS17vN0VhHsHBauN6KcS3FGZbQ4hSZreiK44QDPAMI2wfYIHxNmPXgPAOltt1ok= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=PM4JUOTH; arc=none smtp.client-ip=209.85.208.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="PM4JUOTH" Received: by mail-ed1-f48.google.com with SMTP id 4fb4d7f45d1cf-5cf7b79c6a6so1122178a12.0 for ; Thu, 14 Nov 2024 09:38:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731605890; x=1732210690; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BBdhYIPS0vNHqswEqOk5yrUEAAdG6KLAA34YHKFu7Dg=; b=PM4JUOTH4sFleYMtA84JOS0LOFTib91SZkWqXFI6Ry3wO4UvkQvhSbf1kBZYXnaYbg hKd36PbN8k2+Y/8UZALJazLZd53LH0PKfnOBh5KabgbzAf68nzKcJM7B1OZ27MwKcP1d iesXZyqUof0B0PJTQRemgVW0Geq1w9WeOVPTmZ2Pz9bMwWxs1JmepPYBKiaBNIKTmfEL AnqIh4xS++TrLHskGVWGhEjsyfyWmFRO3yFQzLBx9CfJZVb7tYVQ9e1oI7TLoGDsWF76 Hv9R6RVoEQM6q7GhoqySFOCGGqVZcwC2xHnwi9oIF0/0mRixwO2Mc6+KOhXpVVxL7b53 iL1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731605890; x=1732210690; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BBdhYIPS0vNHqswEqOk5yrUEAAdG6KLAA34YHKFu7Dg=; b=q4YELnDCB9QSKit6dJ0zqPBuEoXoqYzKUZAO13Zve8Hfm3UPNvuvFiu2r47TXywCHQ 
OetbblF85VdjW0Fc1h9LxtPIyfigqZcwK/OIVuETTlPyPiB/AhPmqkQWuz1brd/ErlWo y96RdG7KoHnFBfdivJNI/Ga7wvfSM63jfrqO4mDrJi5XCX2sVbe4jpi9ueoXERh1hF6A gBAJoPssrNZwHRmz7imMvX0kIEIj21MM9XugBXbmzwi9vI/OwqeZwbbclV/9a92cgy0f hQKRWzI44lEtTrWIXPjD7LpjV/Nc1aKuUISHMKPryN2S1NP176ETZmlSaO1v8/9etCBM EIeQ== X-Gm-Message-State: AOJu0YwryM5cvhYHVCvBULu3+BkdUvmcV1wALbzeNqSw7UVkK2IgiS8J 3aiaGQXz4mwqcLc/zdyxpipgY4XOkfo/SVoYQ7oOzbPN/gKOoHVvhYwQgA== X-Google-Smtp-Source: AGHT+IHlYpQKVwJlFOeV6I9fXUDp1PaHkJH3V73OPfYVXb//NOvKJsSY2ncAQ4A9Ra7OxXJJypH9JQ== X-Received: by 2002:a17:906:fe4b:b0:a93:a664:a23f with SMTP id a640c23a62f3a-aa20768144dmr453771566b.5.1731605889772; Thu, 14 Nov 2024 09:38:09 -0800 (PST) Received: from 127.0.0.1localhost ([163.114.131.193]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aa20df56b31sm85799966b.72.2024.11.14.09.38.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Nov 2024 09:38:09 -0800 (PST) From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v2 5/6] io_uring: add memory region registration Date: Thu, 14 Nov 2024 17:38:35 +0000 Message-ID: <1bd8b8abc945bebda2b465b54218be94a2f93d85.1731604990.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Regions will serve multiple purposes. First, with it we can decouple ring/etc. object creation from registration / mapping of the memory they will be placed in. We already have hacks that allow to put both SQ and CQ into the same huge page, in the future we should be able to: region = create_region(io_ring); create_pbuf_ring(io_uring, region, offset=0); create_pbuf_ring(io_uring, region, offset=N); The second use case is efficiently passing parameters. The following patch enables back on top of regions IORING_ENTER_EXT_ARG_REG, which optimises wait arguments. 
It will also be useful for request arguments, replacing iovec, msghdr and similar pointers. Eventually it could also be handy for BPF, if that comes to fruition. Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 3 +++ include/uapi/linux/io_uring.h | 8 ++++++++ io_uring/io_uring.c | 1 + io_uring/register.c | 37 ++++++++++++++++++++++++++++++++++ 4 files changed, 49 insertions(+) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 1d3a37234ace..e1d69123e164 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -429,6 +429,9 @@ struct io_ring_ctx { unsigned short n_sqe_pages; struct page **ring_pages; struct page **sqe_pages; + + /* used for optimised request parameter and wait argument passing */ + struct io_mapped_region param_region; }; struct io_tw_state { diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 5cbfd330c688..1ee35890125b 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -627,6 +627,8 @@ enum io_uring_register_op { /* resize CQ ring */ IORING_REGISTER_RESIZE_RINGS = 33, + IORING_REGISTER_MEM_REGION = 34, + /* this goes last */ IORING_REGISTER_LAST, @@ -661,6 +663,12 @@ struct io_uring_region_desc { __u64 __resv[4]; }; +struct io_uring_mem_region_reg { + __u64 region_uptr; /* struct io_uring_region_desc * */ + __u64 flags; + __u64 __resv[2]; +}; + /* * Register a fully sparse file space, rather than pass in an array of all * -1 file descriptors. 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 286b7bb73978..c640b8a4ceee 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2709,6 +2709,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_alloc_cache_free(&ctx->msg_cache, io_msg_cache_free); io_futex_cache_free(ctx); io_destroy_buffers(ctx); + io_free_region(ctx, &ctx->param_region); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds) put_cred(ctx->sq_creds); diff --git a/io_uring/register.c b/io_uring/register.c index 3c5a3cfb186b..2cbac3d9b288 100644 --- a/io_uring/register.c +++ b/io_uring/register.c @@ -570,6 +570,37 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg) return ret; } +static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg) +{ + struct io_uring_mem_region_reg __user *reg_uptr = uarg; + struct io_uring_mem_region_reg reg; + struct io_uring_region_desc __user *rd_uptr; + struct io_uring_region_desc rd; + int ret; + + if (io_region_is_set(&ctx->param_region)) + return -EBUSY; + if (copy_from_user(&reg, reg_uptr, sizeof(reg))) + return -EFAULT; + rd_uptr = u64_to_user_ptr(reg.region_uptr); + if (copy_from_user(&rd, rd_uptr, sizeof(rd))) + return -EFAULT; + + if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv))) + return -EINVAL; + if (reg.flags) + return -EINVAL; + + ret = io_create_region(ctx, &ctx->param_region, &rd); + if (ret) + return ret; + if (copy_to_user(rd_uptr, &rd, sizeof(rd))) { + io_free_region(ctx, &ctx->param_region); + return -EFAULT; + } + return 0; +} + static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, void __user *arg, unsigned nr_args) __releases(ctx->uring_lock) @@ -764,6 +795,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, break; ret = io_register_resize_rings(ctx, arg); break; + case IORING_REGISTER_MEM_REGION: + ret = -EINVAL; + if (!arg || nr_args != 1) + break; + ret = io_register_mem_region(ctx, arg); + break; default: ret = -EINVAL; 
break; From patchwork Thu Nov 14 17:38:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13875526 Received: from mail-ed1-f49.google.com (mail-ed1-f49.google.com [209.85.208.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22503262A3 for ; Thu, 14 Nov 2024 17:38:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.208.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731605894; cv=none; b=cuXR25VVOW/n/4iwTq7e8hlb5cpgRXlDRlnkh0c0PxCr9awxbCAM9XNiNelyjt5v6zwJCB+XrM5K9lO12/+vH72Oe61EUGi0CSygRc2ZaG8fdB/zB804BGEMzJ7HTA+ORBVD0ym122kxkVlejOJWWKxyIkmnGJFTurKUgzSs6Jc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731605894; c=relaxed/simple; bh=LNgFjHLiBir6oEIW8ncov4hMQdVBoAL03Xq3o0BzmJw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iKhrvTEMkveLVanEgb+SwDgLhDf4AbInW4ddEK/FdIsIfwwY0SmycGDNfRen66CszE1h8iNZEiLYwXgDHK8IWgYlKYIkTTqe3Ooy26sBDjqeJWaPaVLmIAhfoIO3nl7CEXkQKjqbCdw0JTdWzKHEO2+h00orfo3SGqtnvTxRtTw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=d4Kubfcp; arc=none smtp.client-ip=209.85.208.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="d4Kubfcp" Received: by mail-ed1-f49.google.com with SMTP id 4fb4d7f45d1cf-5cb6ca2a776so1534963a12.0 for ; Thu, 14 Nov 2024 09:38:12 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731605891; x=1732210691; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=rK2Es19MgdCzXZpmiUCx1KpjgTYixs0FFvqPiUOwGyE=; b=d4KubfcpiPd/ANUeD/ru51MxbQyAaTm61ckC9u8CzpVZ2uzB5cOIBSAPBODasqwoIF 75BoyGL8Yd5dcZERUejnAK/r5e1iUJKRvyo2tUDNXzXn0dtB9QoYZ44/uMX/TEb6fSuC lD3zWucwzh+wvi1OMjp1L/xTQLgsGu9GdRU/8OxQ9RAA4JtjWeTD4UBTowZP+6m0AD+N nVYUIyUlJUim2pXQ/6dlR4I9lHVetdaCZD4pW+UCFryWyXbh4mEAuk0aoybb8YjtWibf AfKpraHseS2JPo5oMHZqa/7+Ue/fEX4Ir3EWwxKTY89ZcneWDrZmUo4b34bav+9WpW09 E89w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731605891; x=1732210691; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rK2Es19MgdCzXZpmiUCx1KpjgTYixs0FFvqPiUOwGyE=; b=mHjlPfrbKg/McUGfbpLLhwRgxeXOSNclzonjUG4HYgj21dVa/pKRHfF9uwqgv5yzgB cLs+zliwkfNFphK0ypfSApXhmRK10D8yf1JExjdKqM0TuBGjrtsi0PclXXm2vwVpCLFU colCWC9pvqIwj9Msb8zmGTx+fnb/AfLhEPwgv32p91eKgBkyN3Gv/xrgcZb1OB/f/g03 Sf7qMSLjH+E8rN/0y2uD4i7npPiNBXPtC2IIY1n9FiFcHjnlbZlPe0Jq6EPx97yOfC2B o+loTbz0CIx50yuEk2GbIjLNXHBfcqpGlqviGW4YLOXZKwDcH/yd7QBMpxJUEQlf6yDF LgHQ== X-Gm-Message-State: AOJu0Yyv10pJ3W7EdvZOEeEq4i9S59Ra6znPeLFLr4PrLdiwcTV1DCRE 9bwSRSdwyVeRH/OtIibqHi85KT3ntAa7O7urpM1/bJgsGz6AOUHVymBBRA== X-Google-Smtp-Source: AGHT+IGdXbqRjcDjqcCMd7GwWsdcPsPqhg28WyCNh5h6lv9ND4VBLteor9VFGPe8msOiJyzFAMzWMQ== X-Received: by 2002:a17:907:c12:b0:a9a:76d:e86c with SMTP id a640c23a62f3a-aa1c57ef2c1mr1203170366b.49.1731605890465; Thu, 14 Nov 2024 09:38:10 -0800 (PST) Received: from 127.0.0.1localhost ([163.114.131.193]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aa20df56b31sm85799966b.72.2024.11.14.09.38.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Nov 2024 09:38:10 -0800 (PST) From: Pavel 
Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v2 6/6] io_uring: restore back registered wait arguments Date: Thu, 14 Nov 2024 17:38:36 +0000 Message-ID: <24cce6841e4d5ebb3a33bb602a94f5ded77757c3.1731604990.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Now that we have a more generic region registration API, place IORING_ENTER_EXT_ARG_REG on top of it and re-enable it. First, the user has to register a region with the IORING_MEM_REGION_REG_WAIT_ARG flag set. This can only be done while the ring is in a disabled state, i.e. IORING_SETUP_R_DISABLED is still set, to avoid races with already running waiters. The other API difference is that we're now passing byte offsets instead of indexes. The user _must_ align all offsets / pointers to the native word size; failing to do so may, but does not necessarily, lead to a failure, usually returned as -EFAULT. liburing will hide these details from users. 
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 3 +++ include/uapi/linux/io_uring.h | 5 +++++ io_uring/io_uring.c | 14 +++++++++++++- io_uring/register.c | 16 +++++++++++++++- 4 files changed, 36 insertions(+), 2 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index e1d69123e164..aa5f5ea98076 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -324,6 +324,9 @@ struct io_ring_ctx { unsigned cq_entries; struct io_ev_fd __rcu *io_ev_fd; unsigned cq_extra; + + void *cq_wait_arg; + size_t cq_wait_size; } ____cacheline_aligned_in_smp; /* diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 1ee35890125b..4418d0192959 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -663,6 +663,11 @@ struct io_uring_region_desc { __u64 __resv[4]; }; +enum { + /* expose the region as registered wait arguments */ + IORING_MEM_REGION_REG_WAIT_ARG = 1, +}; + struct io_uring_mem_region_reg { __u64 region_uptr; /* struct io_uring_region_desc * */ __u64 flags; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index c640b8a4ceee..c93a6a9cd47e 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3195,7 +3195,19 @@ void __io_uring_cancel(bool cancel_all) static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx, const struct io_uring_getevents_arg __user *uarg) { - return ERR_PTR(-EFAULT); + unsigned long size = sizeof(struct io_uring_reg_wait); + unsigned long offset = (uintptr_t)uarg; + unsigned long end; + + if (unlikely(offset % sizeof(long))) + return ERR_PTR(-EFAULT); + + /* also protects from NULL ->cq_wait_arg as the size would be 0 */ + if (unlikely(check_add_overflow(offset, size, &end) || + end >= ctx->cq_wait_size)) + return ERR_PTR(-EFAULT); + + return ctx->cq_wait_arg + offset; } static int io_validate_ext_arg(struct io_ring_ctx *ctx, unsigned flags, diff --git a/io_uring/register.c 
b/io_uring/register.c index 2cbac3d9b288..1a60f4916649 100644 --- a/io_uring/register.c +++ b/io_uring/register.c @@ -588,7 +588,16 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg) if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv))) return -EINVAL; - if (reg.flags) + if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG) + return -EINVAL; + + /* + * This ensures there are no waiters. Waiters are unlocked and it's + * hard to synchronise with them, especially if we need to initialise + * the region. + */ + if ((reg.flags & IORING_MEM_REGION_REG_WAIT_ARG) && + !(ctx->flags & IORING_SETUP_R_DISABLED)) return -EINVAL; ret = io_create_region(ctx, &ctx->param_region, &rd); @@ -598,6 +607,11 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg) io_free_region(ctx, &ctx->param_region); return -EFAULT; } + + if (reg.flags & IORING_MEM_REGION_REG_WAIT_ARG) { + ctx->cq_wait_arg = io_region_get_ptr(&ctx->param_region); + ctx->cq_wait_size = rd.size; + } return 0; }
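
A note for readers following the offset rules in patch 6/6: the checks in io_get_ext_arg_reg() can be modelled in plain userspace C. This is only a rough sketch, not kernel code. The function name wait_arg_offset_ok() and its parameters are made up for illustration: region_size stands in for ctx->cq_wait_size, entry_size for sizeof(struct io_uring_reg_wait), and the GCC/Clang builtin __builtin_add_overflow() is assumed as a stand-in for the kernel's check_add_overflow().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of the offset validation done in io_get_ext_arg_reg().
 * All names here are hypothetical; nothing in this block is kernel API.
 */
static bool wait_arg_offset_ok(unsigned long offset, unsigned long entry_size,
			       unsigned long region_size)
{
	unsigned long end;

	/* byte offsets must be aligned to the native word size */
	if (offset % sizeof(long))
		return false;

	/* assumed stand-in for check_add_overflow(offset, size, &end) */
	if (__builtin_add_overflow(offset, entry_size, &end))
		return false;

	/*
	 * Same comparison as in this version of the patch. A region size
	 * of 0 (nothing registered) rejects every offset.
	 */
	if (end >= region_size)
		return false;

	return true;
}
```

With a 4096-byte region and 64-byte entries (illustrative values), offsets 0 and 64 pass, offset 3 fails the alignment check, and an unregistered region (size 0) rejects everything. Because the comparison as posted is `>=`, an entry that ends exactly at the end of the region (offset 4032 here) is rejected as well.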