From patchwork Fri Nov 15 16:54:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876638 From: Pavel
Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v3 1/6] io_uring: fortify io_pin_pages with a warning Date: Fri, 15 Nov 2024 16:54:38 +0000 Message-ID: X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 We're a bit too frivolous with the types of nr_pages arguments, converting them to long and back to int, passing an unsigned int pointer as an int pointer and so on. It shouldn't cause any problems, but it should be carefully reviewed. Until then, let's add a WARN_ON_ONCE check to be more confident that callers don't pass poorly checked arguments. Signed-off-by: Pavel Begunkov --- io_uring/memmap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/io_uring/memmap.c b/io_uring/memmap.c index 85c66fa54956..6ab59c60dfd0 100644 --- a/io_uring/memmap.c +++ b/io_uring/memmap.c @@ -140,6 +140,8 @@ struct page **io_pin_pages(unsigned long uaddr, unsigned long len, int *npages) nr_pages = end - start; if (WARN_ON_ONCE(!nr_pages)) return ERR_PTR(-EINVAL); + if (WARN_ON_ONCE(nr_pages > INT_MAX)) + return ERR_PTR(-EOVERFLOW); pages = kvmalloc_array(nr_pages, sizeof(struct page *), GFP_KERNEL); if (!pages)
From patchwork Fri Nov 15 16:54:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876639 From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v3 2/6] io_uring: disable ENTER_EXT_ARG_REG for IOPOLL Date: Fri, 15 Nov 2024 16:54:39 +0000 Message-ID: X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 IOPOLL doesn't use the extended arguments, so there is no need for it to support IORING_ENTER_EXT_ARG_REG. Let's disable it for IOPOLL; if anything, it leaves more space for future extensions.
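For illustration only, the flag handling that io_validate_ext_arg() is left with can be modelled in plain C as below. This is a rough sketch, not the kernel code: the sketch_ names are hypothetical, the two flag values are assumptions that should be checked against include/uapi/linux/io_uring.h, and the copy_from_user() step is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed flag values, redefined locally so the sketch builds on its own. */
#define SKETCH_ENTER_EXT_ARG		(1U << 3)
#define SKETCH_ENTER_EXT_ARG_REG	(1U << 6)

/*
 * Model of the IOPOLL-side argument validation after this patch:
 * no extended argument means nothing to validate, the registered
 * variant is rejected outright, and the plain variant must have the
 * expected size. Returns 0 or a negative errno.
 */
static int sketch_validate_ext_arg_iopoll(unsigned int flags, size_t argsz,
					  size_t expected_argsz)
{
	if (!(flags & SKETCH_ENTER_EXT_ARG))
		return 0;
	if (flags & SKETCH_ENTER_EXT_ARG_REG)
		return -EINVAL;
	if (argsz != expected_argsz)
		return -EINVAL;
	return 0;
}
```

A caller passing both flags now gets -EINVAL regardless of argsz, whereas the old code accepted the call when argsz matched sizeof(struct io_uring_reg_wait).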
Signed-off-by: Pavel Begunkov --- io_uring/io_uring.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index bd71782057de..464a70bde7e6 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3214,12 +3214,8 @@ static int io_validate_ext_arg(struct io_ring_ctx *ctx, unsigned flags, if (!(flags & IORING_ENTER_EXT_ARG)) return 0; - - if (flags & IORING_ENTER_EXT_ARG_REG) { - if (argsz != sizeof(struct io_uring_reg_wait)) - return -EINVAL; - return PTR_ERR(io_get_ext_arg_reg(ctx, argp)); - } + if (flags & IORING_ENTER_EXT_ARG_REG) + return -EINVAL; if (argsz != sizeof(arg)) return -EINVAL; if (copy_from_user(&arg, argp, sizeof(arg)))
From patchwork Fri Nov 15 16:54:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876640 From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v3 3/6] io_uring: temporarily disable registered waits Date: Fri, 15 Nov 2024 16:54:40 +0000 Message-ID: <70b1d1d218c41ba77a76d1789c8641dab0b0563e.1731689588.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Disable wait argument registration as it'll be replaced with a more generic feature. We'll still need IORING_ENTER_EXT_ARG_REG parsing in a few commits so leave it be.
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 10 ----- include/uapi/linux/io_uring.h | 3 -- io_uring/io_uring.c | 10 ----- io_uring/register.c | 82 ---------------------------------- io_uring/register.h | 1 - 5 files changed, 106 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 072e65e93105..52a5da99a205 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -330,14 +330,6 @@ struct io_ring_ctx { atomic_t cq_wait_nr; atomic_t cq_timeouts; struct wait_queue_head cq_wait; - - /* - * If registered with IORING_REGISTER_CQWAIT_REG, a single - * page holds N entries, mapped in cq_wait_arg. cq_wait_index - * is the maximum allowable index. - */ - struct io_uring_reg_wait *cq_wait_arg; - unsigned char cq_wait_index; } ____cacheline_aligned_in_smp; /* timeouts */ @@ -431,8 +423,6 @@ struct io_ring_ctx { unsigned short n_sqe_pages; struct page **ring_pages; struct page **sqe_pages; - - struct page **cq_wait_page; }; struct io_tw_state { diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 5d08435b95a8..132f5db3d4e8 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -627,9 +627,6 @@ enum io_uring_register_op { /* resize CQ ring */ IORING_REGISTER_RESIZE_RINGS = 33, - /* register fixed io_uring_reg_wait arguments */ - IORING_REGISTER_CQWAIT_REG = 34, - /* this goes last */ IORING_REGISTER_LAST, diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 464a70bde7e6..286b7bb73978 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2709,7 +2709,6 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx) io_alloc_cache_free(&ctx->msg_cache, io_msg_cache_free); io_futex_cache_free(ctx); io_destroy_buffers(ctx); - io_unregister_cqwait_reg(ctx); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds) put_cred(ctx->sq_creds); @@ -3195,15 +3194,6 @@ void __io_uring_cancel(bool cancel_all) static struct 
io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx, const struct io_uring_getevents_arg __user *uarg) { - struct io_uring_reg_wait *arg = READ_ONCE(ctx->cq_wait_arg); - - if (arg) { - unsigned int index = (unsigned int) (uintptr_t) uarg; - - if (index <= ctx->cq_wait_index) - return arg + index; - } - return ERR_PTR(-EFAULT); } diff --git a/io_uring/register.c b/io_uring/register.c index 45edfc57963a..3c5a3cfb186b 100644 --- a/io_uring/register.c +++ b/io_uring/register.c @@ -570,82 +570,6 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg) return ret; } -void io_unregister_cqwait_reg(struct io_ring_ctx *ctx) -{ - unsigned short npages = 1; - - if (!ctx->cq_wait_page) - return; - - io_pages_unmap(ctx->cq_wait_arg, &ctx->cq_wait_page, &npages, true); - ctx->cq_wait_arg = NULL; - if (ctx->user) - __io_unaccount_mem(ctx->user, 1); -} - -/* - * Register a page holding N entries of struct io_uring_reg_wait, which can - * be used via io_uring_enter(2) if IORING_GETEVENTS_EXT_ARG_REG is set. - * If that is set with IORING_GETEVENTS_EXT_ARG, then instead of passing - * in a pointer for a struct io_uring_getevents_arg, an index into this - * registered array is passed, avoiding two (arg + timeout) copies per - * invocation. 
- */ -static int io_register_cqwait_reg(struct io_ring_ctx *ctx, void __user *uarg) -{ - struct io_uring_cqwait_reg_arg arg; - struct io_uring_reg_wait *reg; - struct page **pages; - unsigned long len; - int nr_pages, poff; - int ret; - - if (ctx->cq_wait_page || ctx->cq_wait_arg) - return -EBUSY; - if (copy_from_user(&arg, uarg, sizeof(arg))) - return -EFAULT; - if (!arg.nr_entries || arg.flags) - return -EINVAL; - if (arg.struct_size != sizeof(*reg)) - return -EINVAL; - if (check_mul_overflow(arg.struct_size, arg.nr_entries, &len)) - return -EOVERFLOW; - if (len > PAGE_SIZE) - return -EINVAL; - /* offset + len must fit within a page, and must be reg_wait aligned */ - poff = arg.user_addr & ~PAGE_MASK; - if (len + poff > PAGE_SIZE) - return -EINVAL; - if (poff % arg.struct_size) - return -EINVAL; - - pages = io_pin_pages(arg.user_addr, len, &nr_pages); - if (IS_ERR(pages)) - return PTR_ERR(pages); - ret = -EINVAL; - if (nr_pages != 1) - goto out_free; - if (ctx->user) { - ret = __io_account_mem(ctx->user, 1); - if (ret) - goto out_free; - } - - reg = vmap(pages, 1, VM_MAP, PAGE_KERNEL); - if (reg) { - ctx->cq_wait_index = arg.nr_entries - 1; - WRITE_ONCE(ctx->cq_wait_page, pages); - WRITE_ONCE(ctx->cq_wait_arg, (void *) reg + poff); - return 0; - } - ret = -ENOMEM; - if (ctx->user) - __io_unaccount_mem(ctx->user, 1); -out_free: - io_pages_free(&pages, nr_pages); - return ret; -} - static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, void __user *arg, unsigned nr_args) __releases(ctx->uring_lock) @@ -840,12 +764,6 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode, break; ret = io_register_resize_rings(ctx, arg); break; - case IORING_REGISTER_CQWAIT_REG: - ret = -EINVAL; - if (!arg || nr_args != 1) - break; - ret = io_register_cqwait_reg(ctx, arg); - break; default: ret = -EINVAL; break; diff --git a/io_uring/register.h b/io_uring/register.h index 3e935e8fa4b2..a5f39d5ef9e0 100644 --- a/io_uring/register.h +++ 
b/io_uring/register.h @@ -5,6 +5,5 @@ int io_eventfd_unregister(struct io_ring_ctx *ctx); int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id); struct file *io_uring_register_get_file(unsigned int fd, bool registered); -void io_unregister_cqwait_reg(struct io_ring_ctx *ctx); #endif
From patchwork Fri Nov 15 16:54:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876641 From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v3 4/6] io_uring: introduce concept of memory regions Date: Fri, 15 Nov 2024 16:54:41 +0000 Message-ID: <0e6fe25818dfbaebd1bd90b870a6cac503fe1a24.1731689588.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 We've got a good number of mappings we share with userspace, including the main rings, provided buffer rings, upcoming rings for zerocopy rx and more. All of them duplicate user argument parsing and some internal details as well (page pinning, huge page optimisations, mmap'ing, etc.). Introduce a notion of regions. For userspace, for now it's just a new structure called struct io_uring_region_desc, which is supposed to parameterise all such mapping / queue creations. A region either represents a user provided chunk of memory, in which case the user_addr field should point to it, or a request for the kernel to allocate the memory, in which case the user would need to mmap it afterwards using the offset returned in the mmap_offset field. With a uniform userspace API we can avoid additional boilerplate code and apply future optimisations to all of them at once. Internally, there is a new structure, struct io_mapped_region, holding all relevant runtime information, and some helpers to work with it. This patch limits it to user provided regions.
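As a rough userspace illustration of the user provided case, a descriptor could be filled and pre-checked along the following lines. This is only a sketch under stated assumptions: the struct is a local mirror of the io_uring_region_desc layout added by this patch (so it builds without updated headers), all sketch_ names are hypothetical, the check order simply follows io_create_region(), and it assumes every failure is meant to be reported as a negative errno.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of IORING_MEM_REGION_TYPE_USER; hypothetical name. */
#define SKETCH_MEM_REGION_TYPE_USER 1u

/* Local mirror of struct io_uring_region_desc from this patch. */
struct sketch_region_desc {
	uint64_t user_addr;
	uint64_t size;
	uint32_t flags;
	uint32_t id;
	uint64_t mmap_offset;
	uint64_t resv[4];
};

/* Describe a user provided buffer; every other field stays zero. */
static void sketch_region_desc_init(struct sketch_region_desc *rd,
				    const void *buf, uint64_t size)
{
	memset(rd, 0, sizeof(*rd));
	rd->user_addr = (uint64_t)(uintptr_t)buf;
	rd->size = size;
	rd->flags = SKETCH_MEM_REGION_TYPE_USER;
}

/*
 * Userspace pre-check that follows the order of the checks in
 * io_create_region(). page_size must be a power of two.
 * Returns 0 or a negative errno.
 */
static int sketch_region_desc_check(const struct sketch_region_desc *rd,
				    uint64_t page_size)
{
	size_t i;

	for (i = 0; i < 4; i++)
		if (rd->resv[i])
			return -EINVAL;
	if (rd->flags != SKETCH_MEM_REGION_TYPE_USER)
		return -EINVAL;
	if (!rd->user_addr)
		return -EFAULT;
	if (!rd->size || rd->mmap_offset || rd->id)
		return -EINVAL;
	/* The page count must fit in an int. */
	if (rd->size / page_size > INT_MAX)
		return -E2BIG;
	/* Both the address and the size must be page aligned. */
	if ((rd->user_addr | rd->size) & (page_size - 1))
		return -EINVAL;
	/* user_addr + size must not wrap around. */
	if (rd->user_addr > UINT64_MAX - rd->size)
		return -EOVERFLOW;
	return 0;
}
```

A typical caller would allocate page aligned memory (for example with mmap or posix_memalign), call sketch_region_desc_init() on it and run the check before handing the descriptor to the kernel.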
Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 6 +++ include/uapi/linux/io_uring.h | 14 +++++++ io_uring/memmap.c | 67 ++++++++++++++++++++++++++++++++++ io_uring/memmap.h | 14 +++++++ 4 files changed, 101 insertions(+) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 52a5da99a205..1d3a37234ace 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -75,6 +75,12 @@ struct io_hash_table { unsigned hash_bits; }; +struct io_mapped_region { + struct page **pages; + void *vmap_ptr; + size_t nr_pages; +}; + /* * Arbitrary limit, can be raised if need be */ diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 132f5db3d4e8..5cbfd330c688 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -647,6 +647,20 @@ struct io_uring_files_update { __aligned_u64 /* __s32 * */ fds; }; +enum { + /* initialise with user provided memory pointed by user_addr */ + IORING_MEM_REGION_TYPE_USER = 1, +}; + +struct io_uring_region_desc { + __u64 user_addr; + __u64 size; + __u32 flags; + __u32 id; + __u64 mmap_offset; + __u64 __resv[4]; +}; + /* * Register a fully sparse file space, rather than pass in an array of all * -1 file descriptors. 
diff --git a/io_uring/memmap.c b/io_uring/memmap.c index 6ab59c60dfd0..bbd9569a0120 100644 --- a/io_uring/memmap.c +++ b/io_uring/memmap.c @@ -12,6 +12,7 @@ #include "memmap.h" #include "kbuf.h" +#include "rsrc.h" static void *io_mem_alloc_compound(struct page **pages, int nr_pages, size_t size, gfp_t gfp) @@ -194,6 +195,72 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages, return ERR_PTR(-ENOMEM); } +void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr) +{ + if (mr->pages) { + unpin_user_pages(mr->pages, mr->nr_pages); + kvfree(mr->pages); + } + if (mr->vmap_ptr) + vunmap(mr->vmap_ptr); + if (mr->nr_pages && ctx->user) + __io_unaccount_mem(ctx->user, mr->nr_pages); + + memset(mr, 0, sizeof(*mr)); +} + +int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr, + struct io_uring_region_desc *reg) +{ + int pages_accounted = 0; + struct page **pages; + int nr_pages, ret; + void *vptr; + u64 end; + + if (WARN_ON_ONCE(mr->pages || mr->vmap_ptr || mr->nr_pages)) + return -EFAULT; + if (memchr_inv(&reg->__resv, 0, sizeof(reg->__resv))) + return -EINVAL; + if (reg->flags != IORING_MEM_REGION_TYPE_USER) + return -EINVAL; + if (!reg->user_addr) + return -EFAULT; + if (!reg->size || reg->mmap_offset || reg->id) + return -EINVAL; + if ((reg->size >> PAGE_SHIFT) > INT_MAX) + return -E2BIG; + if ((reg->user_addr | reg->size) & ~PAGE_MASK) + return -EINVAL; + if (check_add_overflow(reg->user_addr, reg->size, &end)) + return -EOVERFLOW; + + pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages); + if (IS_ERR(pages)) + return PTR_ERR(pages); + + if (ctx->user) { + ret = __io_account_mem(ctx->user, nr_pages); + if (ret) + goto out_free; + pages_accounted = nr_pages; + } + + vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL); + if (!vptr) + goto out_free; + + mr->pages = pages; + mr->vmap_ptr = vptr; + mr->nr_pages = nr_pages; + return 0; +out_free: + if (pages_accounted) + __io_unaccount_mem(ctx->user, pages_accounted); +
io_pages_free(&pages, nr_pages); + return ret; +} + static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff, size_t sz) { diff --git a/io_uring/memmap.h b/io_uring/memmap.h index 5cec5b7ac49a..f361a635b6c7 100644 --- a/io_uring/memmap.h +++ b/io_uring/memmap.h @@ -22,4 +22,18 @@ unsigned long io_uring_get_unmapped_area(struct file *file, unsigned long addr, unsigned long flags); int io_uring_mmap(struct file *file, struct vm_area_struct *vma); +void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr); +int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr, + struct io_uring_region_desc *reg); + +static inline void *io_region_get_ptr(struct io_mapped_region *mr) +{ + return mr->vmap_ptr; +} + +static inline bool io_region_is_set(struct io_mapped_region *mr) +{ + return !!mr->nr_pages; +} + #endif From patchwork Fri Nov 15 16:54:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876642 Received: from mail-wr1-f49.google.com (mail-wr1-f49.google.com [209.85.221.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 91A7C1D63FB for ; Fri, 15 Nov 2024 16:54:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731689650; cv=none; b=Jb09Nt1OTEhZ3cYRIXf3rUinusE6iDXQlAnUpHYrPbFWVzEx5qi8zU3kfFNghF3i/KaXICrZC6WP4ajRRGBR8alFPdRIlDsSae37pCGgsq3kCnHaP0tzwNLsO2SjjjjRo2pEoQvGioIx1KSnRRAn6ITc2/5e/HeHXdXnntVnd20= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731689650; c=relaxed/simple; bh=Hvv9oNYi9s1ueMiugtSN67O5L03GHYmR01AxOAkhpj4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=m0HWoyQCvsW7Stqew9WcaAlIcUWXWrqbehFUH6yp9cFmJin1vkXlNWXl9Y8GSxAgjuqC1ZVdZwbbS935ObqCBq3HMmif6mCs8uInin0YYYT8vRUDPvJwYX3hit1ox9CxMi8lLcEPoijucqTIvV+xLpYfEJSeh2GzGJt37OwenQc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=dGaHAEjq; arc=none smtp.client-ip=209.85.221.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dGaHAEjq" Received: by mail-wr1-f49.google.com with SMTP id ffacd0b85a97d-38207c86695so1444622f8f.2 for ; Fri, 15 Nov 2024 08:54:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731689645; x=1732294445; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BBdhYIPS0vNHqswEqOk5yrUEAAdG6KLAA34YHKFu7Dg=; b=dGaHAEjqIbsUBrQmfccE+dUufFiMb3DsBHqx3x9cyVW8B+Zj+CQOCWP9OPWQt8r2eD lc4VZfnhzr5mWU0ImelSWfeCZkmcMl3rHnAmVBpwOAxcKo6xfCjhYXMyV9jzekpj+/U3 Sj0wzm3MLN28bsVh5mSGRpiH+LQjYn5PWO3m/wC7YMZksG5NHdBoZKywd5DaFs6dEC+z hPAwVOr2c0SiPuE7wBLrB0lr4/EJRv9r0/cJ7bwruLIlozXpCMiEZcwwJi4GfJZCTWGN NakCM1y0iys+50CC+H7xhyqKFVacjFIToUC6QWkJeJnDS0MEZBUzSVJuB+FawhO+uZY7 Z5qA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731689645; x=1732294445; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BBdhYIPS0vNHqswEqOk5yrUEAAdG6KLAA34YHKFu7Dg=; b=Pn4RP0FOtYcTUTP8Q5sZZ6bk82FuELNJ+OFYtfBVWxR91Wy1S9s2Ke+x7VwwBvwdzU 
qPQAF1V7x/0BNVYIwE9pd0knh4iC1tSJhAPMJL7oNnkSBPBQ/Rf1UHXlabQq/S/EcHIP OEly8FeV7Uw8IpHVDn2qnkS80KYMoPhfwOJ3DEXZ9ihlprQYrSaD4wzcgcKv7xNuiTAi 5Sgw4LMrV3tK1p3FPMzgNGJeTs/4AscvPHkSUy7ehcoB80G558iN9bC1inPBZdgL26R5 NHxwJF9hC2BJtHk3eJahXz7RZYay/usrw+n7JDoJrlecdG6ZznNVxklmDwJRc4c55pj1 o9wA== X-Gm-Message-State: AOJu0YwGp845+VIyEE4QXR9CRMnXBABKzFwYI+9gkgL6rd92QqJuBK8L 80PEVJ967+NfubQ8/md6+PAqQCRwW43ooOn5J8UXn6WjlHtcKsAk9jRXQw== X-Google-Smtp-Source: AGHT+IH0NVr6PYH1ZFuCERS3jlM3NzaeEUn12ufD9zkPU7T66hAyc1AZEH00+TvvD/SbO9oWvGylfQ== X-Received: by 2002:a5d:5847:0:b0:382:1504:f064 with SMTP id ffacd0b85a97d-38225a88a05mr2708795f8f.42.1731689645215; Fri, 15 Nov 2024 08:54:05 -0800 (PST) Received: from 127.0.0.1localhost ([148.252.132.111]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3821ae2f651sm5011895f8f.87.2024.11.15.08.54.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2024 08:54:04 -0800 (PST)
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 5/6] io_uring: add memory region registration
Date: Fri, 15 Nov 2024 16:54:42 +0000
Message-ID: <0798cf3a14fad19cfc96fc9feca5f3e11481691d.1731689588.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.46.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Regions will serve multiple purposes. First, with them we can decouple the creation of rings and other objects from the registration / mapping of the memory they will be placed in. We already have hacks that allow putting both the SQ and the CQ into the same huge page; in the future we should be able to do:

region = create_region(io_ring);
create_pbuf_ring(io_uring, region, offset=0);
create_pbuf_ring(io_uring, region, offset=N);

The second use case is efficiently passing parameters. The following patch re-enables IORING_ENTER_EXT_ARG_REG on top of regions, which optimises wait arguments.
It will also be useful for passing request arguments, replacing pointers to iovecs, msghdr, etc. Eventually it could also be handy for BPF, if that comes to fruition.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  3 +++
 include/uapi/linux/io_uring.h  |  8 ++++++++
 io_uring/io_uring.c            |  1 +
 io_uring/register.c            | 37 ++++++++++++++++++++++++++++++++++
 4 files changed, 49 insertions(+)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 1d3a37234ace..e1d69123e164 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -429,6 +429,9 @@ struct io_ring_ctx {
 	unsigned short n_sqe_pages;
 	struct page **ring_pages;
 	struct page **sqe_pages;
+
+	/* used for optimised request parameter and wait argument passing */
+	struct io_mapped_region param_region;
 };
 
 struct io_tw_state {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 5cbfd330c688..1ee35890125b 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -627,6 +627,8 @@ enum io_uring_register_op {
 	/* resize CQ ring */
 	IORING_REGISTER_RESIZE_RINGS = 33,
 
+	IORING_REGISTER_MEM_REGION = 34,
+
 	/* this goes last */
 	IORING_REGISTER_LAST,
 
@@ -661,6 +663,12 @@ struct io_uring_region_desc {
 	__u64 __resv[4];
 };
 
+struct io_uring_mem_region_reg {
+	__u64 region_uptr; /* struct io_uring_region_desc * */
+	__u64 flags;
+	__u64 __resv[2];
+};
+
 /*
  * Register a fully sparse file space, rather than pass in an array of all
  * -1 file descriptors.
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 286b7bb73978..c640b8a4ceee 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2709,6 +2709,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
 	io_alloc_cache_free(&ctx->msg_cache, io_msg_cache_free);
 	io_futex_cache_free(ctx);
 	io_destroy_buffers(ctx);
+	io_free_region(ctx, &ctx->param_region);
 	mutex_unlock(&ctx->uring_lock);
 	if (ctx->sq_creds)
 		put_cred(ctx->sq_creds);
diff --git a/io_uring/register.c b/io_uring/register.c
index 3c5a3cfb186b..2cbac3d9b288 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -570,6 +570,37 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	return ret;
 }
 
+static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
+{
+	struct io_uring_mem_region_reg __user *reg_uptr = uarg;
+	struct io_uring_mem_region_reg reg;
+	struct io_uring_region_desc __user *rd_uptr;
+	struct io_uring_region_desc rd;
+	int ret;
+
+	if (io_region_is_set(&ctx->param_region))
+		return -EBUSY;
+	if (copy_from_user(&reg, reg_uptr, sizeof(reg)))
+		return -EFAULT;
+	rd_uptr = u64_to_user_ptr(reg.region_uptr);
+	if (copy_from_user(&rd, rd_uptr, sizeof(rd)))
+		return -EFAULT;
+
+	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
+		return -EINVAL;
+	if (reg.flags)
+		return -EINVAL;
+
+	ret = io_create_region(ctx, &ctx->param_region, &rd);
+	if (ret)
+		return ret;
+	if (copy_to_user(rd_uptr, &rd, sizeof(rd))) {
+		io_free_region(ctx, &ctx->param_region);
+		return -EFAULT;
+	}
+	return 0;
+}
+
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -764,6 +795,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_register_resize_rings(ctx, arg);
 		break;
+	case IORING_REGISTER_MEM_REGION:
+		ret = -EINVAL;
+		if (!arg || nr_args != 1)
+			break;
+		ret = io_register_mem_region(ctx, arg);
+		break;
 	default:
 		ret = -EINVAL;
break; From patchwork Fri Nov 15 16:54:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pavel Begunkov X-Patchwork-Id: 13876643 Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02B9D1D5AB7 for ; Fri, 15 Nov 2024 16:54:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731689652; cv=none; b=MTl6Mn9+dd+yHv9zADi3Q/mpb6dvzjYZ2t1RydhWgQVfK5m21W5bGhD+oAH/DaUpkXF65LZL5ix/caAygldFCW0gc4LliWkxa7fv95KyjL7+6fGHGikknUjDdYXfBKO1ABhrp4GsX+JC/B9yXG5jZZBsP0yiM2mbrTulWERk/Gw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731689652; c=relaxed/simple; bh=EIbAFT0q3kOELpCDsKQQyLXO15L65KHz1yWcNvLLmhQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XjrgSIvt3poEV8yEc4K2u3b86orkRjLmuzHqVCQo6TQHHIzQQK5hapr6k4MGY4nlyvIueq+X59rv0j2v+rokmpIQ2ZvZWF25P8bxHwYZCG+rkFdQSIUyvn5ZlblhXriZHEN7lO0MgfMELzbAWsIM1DZec46/XVLIxKCXH1QSlwU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Z5ESlnpa; arc=none smtp.client-ip=209.85.221.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Z5ESlnpa" Received: by mail-wr1-f48.google.com with SMTP id ffacd0b85a97d-3822ba3cdbcso346058f8f.0 for ; Fri, 15 Nov 2024 08:54:10 -0800 (PST) DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1731689647; x=1732294447; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=stTh6Z/X5iv3BRtWLpIQnccw9vPRWvFo0Vx0AyhXGUk=; b=Z5ESlnpahVRVGnL/2i+d6IL445r6RAtI5purAVPmBZe8sA2JYHPtaPSuBnBRW1XL3+ IIOMxbCnBLnsmOVuf3+uuliwthM9Ve99FyP0zEPOdaD9x1bltVKJ4fsP8KNZSLfbIwqQ DCU+13BDagJ3F+gT87VBPmCC1oSUaXDdrG0DBZpFEfWDs8ydFIw4Z9wAxEmL+28qzTxL 608vDiPp0nADPM0BcZejdhDLdPb9tVicTkB6wQVEaR5Tj6XNnY2gHtJr6q4Jb9tn0u1V bDv20epEQQkqMkEtuFFBIW60FXYXwmEArl4eW8sF52fKryPR7rL7xnQE1i6UGVQDdPcd bFUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1731689647; x=1732294447; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=stTh6Z/X5iv3BRtWLpIQnccw9vPRWvFo0Vx0AyhXGUk=; b=eboqOjjnGmdj8zvl7W/5f0IFHSS6IlW7eY3C1Opm4uqWmcyydCnzXphrkNpe42HQSN xEv9whWsuoRJCoOfrVmgl1h5KPUrTJpysgVxlWQTG0309KyzApExHJX8mMiY80Tyfl6E hDjMVRojo7mPGTV83GAbeqRqXfnnHJaXOse1JiEuTS2ENFG1ptLDqlMoNhOl4dSafnI8 kRx6c54aDESOvYeUAQT878CpXl9JtsOsNYTHDhzh5Nal0J7HDEZ1deSqSyejyY3M7mp9 WnXjLLvw3wBnjgjUcujQTpE+9SRa/Qoog3Z8cluhiSOgKpm75NIdTYIGzzwPme7/fak9 39jg== X-Gm-Message-State: AOJu0YxD4k/AzXJvoP8eQtzj97VJzwO6wapIkXC6AuyVmlLD32ox3AIx e88VypaQFkEEADfFJpjS+b6Pk7rDhy7UIRw4rKiw82J85ZmifQha3lJRvQ== X-Google-Smtp-Source: AGHT+IHiqAyl/4K6MoZxsKhNsBln47g8pRtzc2h/QAre3wrl0HzEstBRbXgb378X4Ag5glOGiBUsXA== X-Received: by 2002:a05:6000:991:b0:381:f443:21b9 with SMTP id ffacd0b85a97d-38225901b81mr2461172f8f.3.1731689647419; Fri, 15 Nov 2024 08:54:07 -0800 (PST) Received: from 127.0.0.1localhost ([148.252.132.111]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3821ae2f651sm5011895f8f.87.2024.11.15.08.54.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Nov 2024 08:54:05 -0800 (PST) From: Pavel 
Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 6/6] io_uring: restore back registered wait arguments
Date: Fri, 15 Nov 2024 16:54:43 +0000
Message-ID: <81822c1b4ffbe8ad391b4f9ad1564def0d26d990.1731689588.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.46.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: io-uring@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

Now that we have a more generic region registration API, place IORING_ENTER_EXT_ARG_REG on top of it and re-enable it.

First, the user has to register a region with the IORING_MEM_REGION_REG_WAIT_ARG flag set. This can only be done for a ring in a disabled state, aka IORING_SETUP_R_DISABLED, to avoid races with already running waiters. With that we should have stable, constant values for ctx->cq_wait_{size,arg} in io_get_ext_arg_reg(), and hence no READ_ONCE is required.

The other API difference is that we're now passing byte offsets instead of indexes. The user _must_ align all offsets / pointers to the native word size; failing to do so may, but does not necessarily, lead to a failure, usually returned as -EFAULT. liburing will hide these details from users.
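
To make the offset rules above concrete, here is a minimal userspace sketch of the checks a byte offset has to pass. It is a model rather than the kernel code: `reg_wait_model` is a hypothetical stand-in for struct io_uring_reg_wait (assumed to be 64 bytes here), `reg_wait_offset_ok()` is an illustrative name, and the GCC/Clang builtin `__builtin_add_overflow` stands in for the kernel's check_add_overflow(). The comparison against the region size mirrors the patch as posted, so a slot that ends exactly at the end of the region is rejected.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct io_uring_reg_wait; only its size matters. */
struct reg_wait_model {
	uint64_t words[8];
};

static_assert(sizeof(struct reg_wait_model) == 64,
	      "the model assumes a 64-byte wait argument slot");

/*
 * Return true if a wait argument slot starting at byte `offset` would be
 * accepted for a registered region of `region_size` bytes.
 */
static bool reg_wait_offset_ok(unsigned long offset, unsigned long region_size)
{
	unsigned long size = sizeof(struct reg_wait_model);
	unsigned long end;

	/* offsets must be aligned to the native word size */
	if (offset % sizeof(long))
		return false;

	/* offset + size must not wrap around */
	if (__builtin_add_overflow(offset, size, &end))
		return false;

	/* same comparison as the patch; a zero-sized region rejects everything */
	if (end >= region_size)
		return false;

	return true;
}
```

With a 4096-byte region, offset 0 is accepted, offset 1 fails the alignment check, and offset 4096 - 64 is rejected because the slot would end exactly at the region boundary.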
Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  3 +++
 include/uapi/linux/io_uring.h  |  5 +++++
 io_uring/io_uring.c            | 14 +++++++++++++-
 io_uring/register.c            | 16 +++++++++++++++-
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index e1d69123e164..aa5f5ea98076 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -324,6 +324,9 @@ struct io_ring_ctx {
 		unsigned cq_entries;
 		struct io_ev_fd __rcu *io_ev_fd;
 		unsigned cq_extra;
+
+		void *cq_wait_arg;
+		size_t cq_wait_size;
 	} ____cacheline_aligned_in_smp;
 
 	/*
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 1ee35890125b..4418d0192959 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -663,6 +663,11 @@ struct io_uring_region_desc {
 	__u64 __resv[4];
 };
 
+enum {
+	/* expose the region as registered wait arguments */
+	IORING_MEM_REGION_REG_WAIT_ARG = 1,
+};
+
 struct io_uring_mem_region_reg {
 	__u64 region_uptr; /* struct io_uring_region_desc * */
 	__u64 flags;
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c640b8a4ceee..c93a6a9cd47e 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3195,7 +3195,19 @@ void __io_uring_cancel(bool cancel_all)
 static struct io_uring_reg_wait *io_get_ext_arg_reg(struct io_ring_ctx *ctx,
 			const struct io_uring_getevents_arg __user *uarg)
 {
-	return ERR_PTR(-EFAULT);
+	unsigned long size = sizeof(struct io_uring_reg_wait);
+	unsigned long offset = (uintptr_t)uarg;
+	unsigned long end;
+
+	if (unlikely(offset % sizeof(long)))
+		return ERR_PTR(-EFAULT);
+
+	/* also protects from NULL ->cq_wait_arg as the size would be 0 */
+	if (unlikely(check_add_overflow(offset, size, &end) ||
+		     end >= ctx->cq_wait_size))
+		return ERR_PTR(-EFAULT);
+
+	return ctx->cq_wait_arg + offset;
 }
 
 static int io_validate_ext_arg(struct io_ring_ctx *ctx, unsigned flags,
diff --git a/io_uring/register.c b/io_uring/register.c
index 2cbac3d9b288..1a60f4916649 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -588,7 +588,16 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 
 	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
 		return -EINVAL;
-	if (reg.flags)
+	if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG)
+		return -EINVAL;
+
+	/*
+	 * This ensures there are no waiters. Waiters are unlocked and it's
+	 * hard to synchronise with them, especially if we need to initialise
+	 * the region.
+	 */
+	if ((reg.flags & IORING_MEM_REGION_REG_WAIT_ARG) &&
+	    !(ctx->flags & IORING_SETUP_R_DISABLED))
 		return -EINVAL;
 
 	ret = io_create_region(ctx, &ctx->param_region, &rd);
@@ -598,6 +607,11 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 		io_free_region(ctx, &ctx->param_region);
 		return -EFAULT;
 	}
+
+	if (reg.flags & IORING_MEM_REGION_REG_WAIT_ARG) {
+		ctx->cq_wait_arg = io_region_get_ptr(&ctx->param_region);
+		ctx->cq_wait_size = rd.size;
+	}
 	return 0;
 }
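
For readers following the registration path from userspace, the argument checks in io_register_mem_region() after this patch can be modelled roughly as below. This is only a sketch under stated assumptions: `model_mem_region_reg` mirrors struct io_uring_mem_region_reg with fixed-width types, the `model_*` names are hypothetical, and the copy_from_user() / copy_to_user() failure paths (-EFAULT) and the region creation itself are left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same value as IORING_MEM_REGION_REG_WAIT_ARG in the uapi change. */
#define MODEL_MEM_REGION_REG_WAIT_ARG 1ULL

/* Mirror of struct io_uring_mem_region_reg using fixed-width types. */
struct model_mem_region_reg {
	uint64_t region_uptr;	/* would point at a struct io_uring_region_desc */
	uint64_t flags;
	uint64_t resv[2];
};

/* Hypothetical helper: fill the argument the way userspace would. */
static void model_prep_wait_region(struct model_mem_region_reg *reg,
				   const void *region_desc)
{
	memset(reg, 0, sizeof(*reg));	/* reserved fields must stay zero */
	reg->region_uptr = (uint64_t)(uintptr_t)region_desc;
	reg->flags = MODEL_MEM_REGION_REG_WAIT_ARG;
}

/*
 * Model of the argument checks, in the same order as the patch.
 * Returns 0 or a negative errno value.
 */
static int model_check_mem_region_reg(const struct model_mem_region_reg *reg,
				      bool region_already_set,
				      bool ring_disabled)
{
	/* only one parameter region per ring */
	if (region_already_set)
		return -EBUSY;
	/* reserved fields must be zero */
	if (reg->resv[0] || reg->resv[1])
		return -EINVAL;
	/* unknown flags are rejected */
	if (reg->flags & ~MODEL_MEM_REGION_REG_WAIT_ARG)
		return -EINVAL;
	/* wait arguments can only be set up while the ring is disabled */
	if ((reg->flags & MODEL_MEM_REGION_REG_WAIT_ARG) && !ring_disabled)
		return -EINVAL;
	return 0;
}
```

In a real program the filled structure would be passed to io_uring_register() with opcode IORING_REGISTER_MEM_REGION and nr_args set to 1, on a ring created with IORING_SETUP_R_DISABLED and not yet enabled.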