From patchwork Fri Nov 29 13:34:22 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13888688
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 01/18] io_uring: rename ->resize_lock
Date: Fri, 29 Nov 2024 13:34:22 +0000
Message-ID: <68f705306f3ac4d2fb999eb80ea1615015ce9f7f.1732886067.git.asml.silence@gmail.com>

->resize_lock is used for resizing rings, but it's a good idea to reuse
it in other cases as well. Rename it to mmap_lock, as it protects
against races with mmap.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h | 2 +-
 io_uring/io_uring.c            | 2 +-
 io_uring/memmap.c              | 6 +++---
 io_uring/register.c            | 8 ++++----
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 3e934feb3187..adb36e0da40e 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -423,7 +423,7 @@ struct io_ring_ctx {
	 * side will need to grab this lock, to prevent either side from
	 * being run concurrently with the other.
	 */
-	struct mutex		resize_lock;
+	struct mutex		mmap_lock;

	/*
	 * If IORING_SETUP_NO_MMAP is used, then the below holds
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ae199e44da57..c713ef35447b 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -351,7 +351,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
	INIT_WQ_LIST(&ctx->submit_state.compl_reqs);
	INIT_HLIST_HEAD(&ctx->cancelable_uring_cmd);
	io_napi_init(ctx);
-	mutex_init(&ctx->resize_lock);
+	mutex_init(&ctx->mmap_lock);

	return ctx;
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 57de9bccbf50..a0d4151d11af 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -329,7 +329,7 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
	unsigned int npages;
	void *ptr;

-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);

	ptr = io_uring_validate_mmap_request(file, vma->vm_pgoff, sz);
	if (IS_ERR(ptr))
@@ -365,7 +365,7 @@ unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,
	if (addr)
		return -EINVAL;

-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);

	ptr = io_uring_validate_mmap_request(filp, pgoff, len);
	if (IS_ERR(ptr))
@@ -415,7 +415,7 @@ unsigned long io_uring_get_unmapped_area(struct file *file, unsigned long addr,
	struct io_ring_ctx *ctx = file->private_data;
	void *ptr;

-	guard(mutex)(&ctx->resize_lock);
+	guard(mutex)(&ctx->mmap_lock);

	ptr = io_uring_validate_mmap_request(file, pgoff, len);
	if (IS_ERR(ptr))
diff --git a/io_uring/register.c b/io_uring/register.c
index 1e99c783abdf..ba61697d7a53 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -486,15 +486,15 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
	}

	/*
-	 * We'll do the swap. Grab the ctx->resize_lock, which will exclude
+	 * We'll do the swap. Grab the ctx->mmap_lock, which will exclude
	 * any new mmap's on the ring fd. Clear out existing mappings to prevent
	 * mmap from seeing them, as we'll unmap them. Any attempt to mmap
	 * existing rings beyond this point will fail. Not that it could proceed
	 * at this point anyway, as the io_uring mmap side needs go grab the
-	 * ctx->resize_lock as well. Likewise, hold the completion lock over the
+	 * ctx->mmap_lock as well. Likewise, hold the completion lock over the
	 * duration of the actual swap.
	 */
-	mutex_lock(&ctx->resize_lock);
+	mutex_lock(&ctx->mmap_lock);
	spin_lock(&ctx->completion_lock);
	o.rings = ctx->rings;
	ctx->rings = NULL;
@@ -561,7 +561,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
	ret = 0;
out:
	spin_unlock(&ctx->completion_lock);
-	mutex_unlock(&ctx->resize_lock);
+	mutex_unlock(&ctx->mmap_lock);
	io_register_free_rings(&p, to_free);

	if (ctx->sq_data)
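The memmap.c hunks in patch 01 take the lock via guard(mutex), the kernel's scope-based lock guard from <linux/cleanup.h>, which unlocks automatically on every return path. The same behaviour can be sketched in userspace C with the GCC/Clang cleanup attribute. SCOPED_MUTEX, unlock_guard, and bump below are invented names for illustration, not the kernel's API:

```c
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* Cleanup handler: called with the address of the guard variable. */
static void unlock_guard(pthread_mutex_t **m)
{
    pthread_mutex_unlock(*m);
}

/*
 * Lock on declaration; the unlock runs automatically when _guard leaves
 * scope, including on early returns, so no explicit unlock is needed.
 */
#define SCOPED_MUTEX(m) \
    __attribute__((cleanup(unlock_guard), unused)) \
    pthread_mutex_t *_guard = (pthread_mutex_lock(m), (m))

int bump(int fail)
{
    SCOPED_MUTEX(&demo_lock);

    if (fail)
        return -1;  /* early return: the guard still unlocks */
    shared_counter++;
    return 0;
}
```

This is why the hunks above can drop paired unlock calls: the guard variable's scope does the bookkeeping.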
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 02/18] io_uring/rsrc: export io_check_coalesce_buffer
Date: Fri, 29 Nov 2024 13:34:23 +0000
Message-ID: <353b447953cd5d34c454a7d909bb6024c391d6e2.1732886067.git.asml.silence@gmail.com>

io_try_coalesce_buffer() is a useful helper that collects information
about a set of pages, and I want to reuse it for analysing ring/etc.
mappings. I don't need the entire thing and am only interested in
whether the buffer can be coalesced into a single page, but that's
still better than duplicating the parsing.

Signed-off-by: Pavel Begunkov
---
 io_uring/rsrc.c | 22 ++++++++++++----------
 io_uring/rsrc.h |  4 ++++
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index adaae8630932..e51e5ddae728 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -626,11 +626,12 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
	return ret;
}

-static bool io_do_coalesce_buffer(struct page ***pages, int *nr_pages,
-				struct io_imu_folio_data *data, int nr_folios)
+static bool io_coalesce_buffer(struct page ***pages, int *nr_pages,
+				struct io_imu_folio_data *data)
{
	struct page **page_array = *pages, **new_array = NULL;
	int nr_pages_left = *nr_pages, i, j;
+	int nr_folios = data->nr_folios;

	/* Store head pages only*/
	new_array = kvmalloc_array(nr_folios, sizeof(struct page *),
@@ -667,15 +668,14 @@ static bool io_do_coalesce_buffer(struct page ***pages, int *nr_pages,
	return true;
}

-static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
-					struct io_imu_folio_data *data)
+bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
+			      struct io_imu_folio_data *data)
{
-	struct page **page_array = *pages;
	struct folio *folio = page_folio(page_array[0]);
	unsigned int count = 1, nr_folios = 1;
	int i;

-	if (*nr_pages <= 1)
+	if (nr_pages <= 1)
		return false;

	data->nr_pages_mid = folio_nr_pages(folio);
@@ -687,7 +687,7 @@ static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
	 * Check if pages are contiguous inside a folio, and all folios have
	 * the same page count except for the head and tail.
	 */
-	for (i = 1; i < *nr_pages; i++) {
+	for (i = 1; i < nr_pages; i++) {
		if (page_folio(page_array[i]) == folio &&
		    page_array[i] == page_array[i-1] + 1) {
			count++;
@@ -715,7 +715,8 @@ static bool io_try_coalesce_buffer(struct page ***pages, int *nr_pages,
	if (nr_folios == 1)
		data->nr_pages_head = count;

-	return io_do_coalesce_buffer(pages, nr_pages, data, nr_folios);
+	data->nr_folios = nr_folios;
+	return true;
}

static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
@@ -729,7 +730,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
	size_t size;
	int ret, nr_pages, i;
	struct io_imu_folio_data data;
-	bool coalesced;
+	bool coalesced = false;

	if (!iov->iov_base)
		return NULL;
@@ -749,7 +750,8 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
	}

	/* If it's huge page(s), try to coalesce them into fewer bvec entries */
-	coalesced = io_try_coalesce_buffer(&pages, &nr_pages, &data);
+	if (io_check_coalesce_buffer(pages, nr_pages, &data))
+		coalesced = io_coalesce_buffer(&pages, &nr_pages, &data);

	imu = kvmalloc(struct_size(imu, bvec, nr_pages), GFP_KERNEL);
	if (!imu)
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index 7a4668deaa1a..c8b093584461 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -40,6 +40,7 @@ struct io_imu_folio_data {
	/* For non-head/tail folios, has to be fully included */
	unsigned int	nr_pages_mid;
	unsigned int	folio_shift;
+	unsigned int	nr_folios;
};

struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx, int type);
@@ -66,6 +67,9 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
		     unsigned int size, unsigned int type);

+bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
+			      struct io_imu_folio_data *data);
+
static inline struct io_rsrc_node *io_rsrc_node_lookup(struct io_rsrc_data *data,
						       int index)
{
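The check half that patch 02 splits out walks the page array and decides whether neighbouring pages are contiguous. A minimal userspace sketch of that kind of contiguity scan follows; it deliberately skips the per-folio head/tail accounting the real io_check_coalesce_buffer() does, and pages_are_contiguous and PAGE_SZ are invented names:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u

/*
 * Report whether an array of page addresses forms one contiguous run,
 * i.e. whether the whole range could be described by a single larger
 * entry instead of one entry per page.
 */
bool pages_are_contiguous(const uintptr_t *pages, size_t nr_pages)
{
    size_t i;

    if (nr_pages <= 1)
        return false;  /* nothing to coalesce, same as the kernel check */

    for (i = 1; i < nr_pages; i++) {
        if (pages[i] != pages[i - 1] + PAGE_SZ)
            return false;
    }
    return true;
}
```

Splitting "check" from "act" like this is what lets the patch reuse the analysis for ring mappings without also pulling in the array-rewriting half.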
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 03/18] io_uring/memmap: flag vmap'ed regions
Date: Fri, 29 Nov 2024 13:34:24 +0000
Message-ID: <5a3d8046a038da97c0f8a8c8f1733fa3fc689d31.1732886067.git.asml.silence@gmail.com>

Add internal flags for struct io_mapped_region. The first flag we need
is IO_REGION_F_VMAP, which indicates that the pointer has to be
unmapped on region destruction. For now all regions are vmap'ed, so it
is set unconditionally.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  5 +++--
 io_uring/memmap.c              | 14 ++++++++++----
 io_uring/memmap.h              |  2 +-
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index adb36e0da40e..4cee414080fd 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -77,8 +77,9 @@ struct io_hash_table {

struct io_mapped_region {
	struct page	**pages;
-	void		*vmap_ptr;
-	size_t		nr_pages;
+	void		*ptr;
+	unsigned	nr_pages;
+	unsigned	flags;
};

/*
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index a0d4151d11af..31fb8c8ffe4e 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -202,14 +202,19 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
	return ERR_PTR(-ENOMEM);
}

+enum {
+	/* memory was vmap'ed for the kernel, freeing the region vunmap's it */
+	IO_REGION_F_VMAP	= 1,
+};
+
void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
{
	if (mr->pages) {
		unpin_user_pages(mr->pages, mr->nr_pages);
		kvfree(mr->pages);
	}
-	if (mr->vmap_ptr)
-		vunmap(mr->vmap_ptr);
+	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)
+		vunmap(mr->ptr);
	if (mr->nr_pages && ctx->user)
		__io_unaccount_mem(ctx->user, mr->nr_pages);

@@ -225,7 +230,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
	void *vptr;
	u64 end;

-	if (WARN_ON_ONCE(mr->pages || mr->vmap_ptr || mr->nr_pages))
+	if (WARN_ON_ONCE(mr->pages || mr->ptr || mr->nr_pages))
		return -EFAULT;
	if (memchr_inv(&reg->__resv, 0, sizeof(reg->__resv)))
		return -EINVAL;
@@ -260,8 +265,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
	}

	mr->pages = pages;
-	mr->vmap_ptr = vptr;
+	mr->ptr = vptr;
	mr->nr_pages = nr_pages;
+	mr->flags |= IO_REGION_F_VMAP;
	return 0;
out_free:
	if (pages_accounted)
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index f361a635b6c7..2096a8427277 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -28,7 +28,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,

static inline void *io_region_get_ptr(struct io_mapped_region *mr)
{
-	return mr->vmap_ptr;
+	return mr->ptr;
}

static inline bool io_region_is_set(struct io_mapped_region *mr)
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 04/18] io_uring/memmap: flag regions with user pages
Date: Fri, 29 Nov 2024 13:34:25 +0000
Message-ID: <0dc91564642654405bab080b7ec911cb4a43ec6e.1732886067.git.asml.silence@gmail.com>

In preparation for kernel-allocated regions, add a flag telling whether
the region contains user-pinned pages or not.

Signed-off-by: Pavel Begunkov
---
 io_uring/memmap.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 31fb8c8ffe4e..a0416733e921 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -205,12 +205,17 @@ void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
enum {
	/* memory was vmap'ed for the kernel, freeing the region vunmap's it */
	IO_REGION_F_VMAP	= 1,
+	/* memory is provided by user and pinned by the kernel */
+	IO_REGION_F_USER_PROVIDED = 2,
};

void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
{
	if (mr->pages) {
-		unpin_user_pages(mr->pages, mr->nr_pages);
+		if (mr->flags & IO_REGION_F_USER_PROVIDED)
+			unpin_user_pages(mr->pages, mr->nr_pages);
+		else
+			release_pages(mr->pages, mr->nr_pages);
		kvfree(mr->pages);
	}
	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)
@@ -267,7 +272,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
	mr->pages = pages;
	mr->ptr = vptr;
	mr->nr_pages = nr_pages;
-	mr->flags |= IO_REGION_F_VMAP;
+	mr->flags |= IO_REGION_F_VMAP | IO_REGION_F_USER_PROVIDED;
	return 0;
out_free:
	if (pages_accounted)
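The distinction the new flag captures is that pinned user pages and kernel-allocated pages must be dropped through different calls (unpin_user_pages() vs release_pages()). The shape of that branch can be sketched like this; counters stand in for the two kernel release paths, and every name here is illustrative:

```c
#include <stdlib.h>

enum {
    REGION_F_USER_PROVIDED = 2,  /* pages came from pinned user memory */
};

static int user_unpins;       /* times we "unpinned" user pages */
static int kernel_releases;   /* times we "released" kernel pages */

struct page_region {
    void     **pages;    /* array of per-page handles */
    int        nr_pages;
    unsigned   flags;
};

/*
 * Drop the page references the way they were taken: a region built from
 * pinned user memory is unpinned, a kernel-allocated one is released.
 */
void region_drop_pages(struct page_region *r)
{
    if (!r->pages)
        return;
    if (r->flags & REGION_F_USER_PROVIDED)
        user_unpins += r->nr_pages;      /* ~ unpin_user_pages() */
    else
        kernel_releases += r->nr_pages;  /* ~ release_pages() */
    free(r->pages);
    r->pages = NULL;
}
```

As in the kernel patch, the flag is set where the pages are acquired, so the free path never has to guess their origin.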
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 05/18] io_uring/memmap: account memory before pinning
Date: Fri, 29 Nov 2024 13:34:26 +0000
Message-ID: <1e242b8038411a222e8b269d35e021fa5015289f.1732886067.git.asml.silence@gmail.com>
Move memory accounting before page pinning. It shouldn't even try to
pin pages if it's not allowed to, and accounting is also relatively
inexpensive. It also gives a better code structure, as we do the generic
accounting first and can then branch for different mapping types.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/memmap.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index a0416733e921..fca93bc4c6f1 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -252,17 +252,21 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	if (check_add_overflow(reg->user_addr, reg->size, &end))
 		return -EOVERFLOW;
 
-	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
+	nr_pages = reg->size >> PAGE_SHIFT;
 	if (ctx->user) {
 		ret = __io_account_mem(ctx->user, nr_pages);
 		if (ret)
-			goto out_free;
+			return ret;
 		pages_accounted = nr_pages;
 	}
 
+	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
+	if (IS_ERR(pages)) {
+		ret = PTR_ERR(pages);
+		pages = NULL;
+		goto out_free;
+	}
+
 	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!vptr) {
 		ret = -ENOMEM;
@@ -277,7 +281,8 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 out_free:
 	if (pages_accounted)
 		__io_unaccount_mem(ctx->user, pages_accounted);
-	io_pages_free(&pages, nr_pages);
+	if (pages)
+		io_pages_free(&pages, nr_pages);
 	return ret;
 }

From patchwork Fri Nov 29 13:34:27 2024
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 06/18] io_uring/memmap: reuse io_free_region for failure path
Date: Fri, 29 Nov 2024 13:34:27 +0000

Regions are going to become more complex with allocation options and
optimisations, so I want to split initialisation into steps, and for
that it needs a sane failure path. Reuse io_free_region(): it's smart
enough to undo only what's needed and leaves the structure in a
consistent state.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/memmap.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index fca93bc4c6f1..96c4f6b61171 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -229,7 +229,6 @@ void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
-	int pages_accounted = 0;
 	struct page **pages;
 	int nr_pages, ret;
 	void *vptr;
@@ -257,32 +256,27 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		ret = __io_account_mem(ctx->user, nr_pages);
 		if (ret)
 			return ret;
-		pages_accounted = nr_pages;
 	}
 
+	mr->nr_pages = nr_pages;
 	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
 	if (IS_ERR(pages)) {
 		ret = PTR_ERR(pages);
-		pages = NULL;
 		goto out_free;
 	}
+	mr->pages = pages;
+	mr->flags |= IO_REGION_F_USER_PROVIDED;
 
 	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!vptr) {
 		ret = -ENOMEM;
 		goto out_free;
 	}
-
-	mr->pages = pages;
 	mr->ptr = vptr;
-	mr->nr_pages = nr_pages;
-	mr->flags |= IO_REGION_F_VMAP | IO_REGION_F_USER_PROVIDED;
+	mr->flags |= IO_REGION_F_VMAP;
 	return 0;
 out_free:
-	if (pages_accounted)
-		__io_unaccount_mem(ctx->user, pages_accounted);
-	if (pages)
-		io_pages_free(&pages, nr_pages);
+	io_free_region(ctx, mr);
 	return ret;
 }

From patchwork Fri Nov 29 13:34:28 2024
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 07/18] io_uring/memmap: optimise single folio regions
Date: Fri, 29 Nov 2024 13:34:28 +0000

We don't need to vmap if the memory is already physically contiguous.
There are two important cases it covers: PAGE_SIZE regions and huge
pages. Use io_check_coalesce_buffer() to get the number of contiguous
folios.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/memmap.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 96c4f6b61171..fd348c98f64f 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -226,12 +226,31 @@ void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 	memset(mr, 0, sizeof(*mr));
 }
 
+static int io_region_init_ptr(struct io_mapped_region *mr)
+{
+	struct io_imu_folio_data ifd;
+	void *ptr;
+
+	if (io_check_coalesce_buffer(mr->pages, mr->nr_pages, &ifd)) {
+		if (ifd.nr_folios == 1) {
+			mr->ptr = page_address(mr->pages[0]);
+			return 0;
+		}
+	}
+	ptr = vmap(mr->pages, mr->nr_pages, VM_MAP, PAGE_KERNEL);
+	if (!ptr)
+		return -ENOMEM;
+
+	mr->ptr = ptr;
+	mr->flags |= IO_REGION_F_VMAP;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
 	struct page **pages;
 	int nr_pages, ret;
-	void *vptr;
 	u64 end;
 
 	if (WARN_ON_ONCE(mr->pages || mr->ptr || mr->nr_pages))
@@ -267,13 +286,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	mr->pages = pages;
 	mr->flags |= IO_REGION_F_USER_PROVIDED;
 
-	vptr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
-	if (!vptr) {
-		ret = -ENOMEM;
+	ret = io_region_init_ptr(mr);
+	if (ret)
 		goto out_free;
-	}
-	mr->ptr = vptr;
-	mr->flags |= IO_REGION_F_VMAP;
 	return 0;
 out_free:
 	io_free_region(ctx, mr);

From patchwork Fri Nov 29 13:34:29 2024
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 08/18] io_uring/memmap: helper for pinning region pages
Date: Fri, 29 Nov 2024 13:34:29 +0000

In preparation for adding kernel-allocated regions, extract a new helper
that pins user pages.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/memmap.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index fd348c98f64f..5d261e07c2e3 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -246,10 +246,28 @@ static int io_region_init_ptr(struct io_mapped_region *mr)
 	return 0;
 }
 
+static int io_region_pin_pages(struct io_ring_ctx *ctx,
+			       struct io_mapped_region *mr,
+			       struct io_uring_region_desc *reg)
+{
+	unsigned long size = mr->nr_pages << PAGE_SHIFT;
+	struct page **pages;
+	int nr_pages;
+
+	pages = io_pin_pages(reg->user_addr, size, &nr_pages);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
+	if (WARN_ON_ONCE(nr_pages != mr->nr_pages))
+		return -EFAULT;
+
+	mr->pages = pages;
+	mr->flags |= IO_REGION_F_USER_PROVIDED;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
-	struct page **pages;
 	int nr_pages, ret;
 	u64 end;
 
@@ -278,13 +296,9 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	}
 	mr->nr_pages = nr_pages;
 
-	pages = io_pin_pages(reg->user_addr, reg->size, &nr_pages);
-	if (IS_ERR(pages)) {
-		ret = PTR_ERR(pages);
+	ret = io_region_pin_pages(ctx, mr, reg);
+	if (ret)
 		goto out_free;
-	}
-	mr->pages = pages;
-	mr->flags |= IO_REGION_F_USER_PROVIDED;
 
 	ret = io_region_init_ptr(mr);
 	if (ret)

From patchwork Fri Nov 29 13:34:30 2024
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 09/18] io_uring/memmap: add IO_REGION_F_SINGLE_REF
Date: Fri, 29 Nov 2024 13:34:30 +0000
Kernel-allocated compound pages will have just one reference for the
entire page array, so add a flag telling io_free_region() about that.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/memmap.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 5d261e07c2e3..a37ccb167258 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -207,15 +207,23 @@ enum {
 	IO_REGION_F_VMAP			= 1,
 	/* memory is provided by user and pinned by the kernel */
 	IO_REGION_F_USER_PROVIDED		= 2,
+	/* only the first page in the array is ref'ed */
+	IO_REGION_F_SINGLE_REF			= 4,
 };
 
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr)
 {
 	if (mr->pages) {
+		long nr_refs = mr->nr_pages;
+
+		if (mr->flags & IO_REGION_F_SINGLE_REF)
+			nr_refs = 1;
+
 		if (mr->flags & IO_REGION_F_USER_PROVIDED)
-			unpin_user_pages(mr->pages, mr->nr_pages);
+			unpin_user_pages(mr->pages, nr_refs);
 		else
-			release_pages(mr->pages, mr->nr_pages);
+			release_pages(mr->pages, nr_refs);
+
 		kvfree(mr->pages);
 	}
 	if ((mr->flags & IO_REGION_F_VMAP) && mr->ptr)

From patchwork Fri Nov 29 13:34:31 2024
From: Pavel Begunkov <asml.silence@gmail.com>
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 10/18] io_uring/memmap: implement kernel allocated regions
Date: Fri, 29 Nov 2024 13:34:31 +0000
Message-ID: <7b8c40e6542546bbf93f4842a9a42a7373b81e0d.1732886067.git.asml.silence@gmail.com>
Allow the kernel to allocate memory for a region. That's the classical
way SQ/CQ are allocated. It's not yet useful to user space as there
is no way to mmap it, which is why it's explicitly disabled in
io_register_mem_region().

Signed-off-by: Pavel Begunkov
---
 io_uring/memmap.c   | 43 ++++++++++++++++++++++++++++++++++++++++---
 io_uring/register.c |  2 ++
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index a37ccb167258..0908a71bf57e 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -273,6 +273,39 @@ static int io_region_pin_pages(struct io_ring_ctx *ctx,
 	return 0;
 }
 
+static int io_region_allocate_pages(struct io_ring_ctx *ctx,
+				    struct io_mapped_region *mr,
+				    struct io_uring_region_desc *reg)
+{
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
+	unsigned long size = mr->nr_pages << PAGE_SHIFT;
+	unsigned long nr_allocated;
+	struct page **pages;
+	void *p;
+
+	pages = kvmalloc_array(mr->nr_pages, sizeof(*pages), gfp);
+	if (!pages)
+		return -ENOMEM;
+
+	p = io_mem_alloc_compound(pages, mr->nr_pages, size, gfp);
+	if (!IS_ERR(p)) {
+		mr->flags |= IO_REGION_F_SINGLE_REF;
+		mr->pages = pages;
+		return 0;
+	}
+
+	nr_allocated = alloc_pages_bulk_array_node(gfp, NUMA_NO_NODE,
+						   mr->nr_pages, pages);
+	if (nr_allocated != mr->nr_pages) {
+		if (nr_allocated)
+			release_pages(pages, nr_allocated);
+		kvfree(pages);
+		return -ENOMEM;
+	}
+	mr->pages = pages;
+	return 0;
+}
+
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		     struct io_uring_region_desc *reg)
 {
@@ -283,9 +316,10 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 		return -EFAULT;
 	if (memchr_inv(&reg->__resv, 0, sizeof(reg->__resv)))
 		return -EINVAL;
-	if (reg->flags != IORING_MEM_REGION_TYPE_USER)
+	if (reg->flags & ~IORING_MEM_REGION_TYPE_USER)
 		return -EINVAL;
-	if (!reg->user_addr)
+	/* user_addr should be set IFF it's a user memory backed region */
+	if ((reg->flags & IORING_MEM_REGION_TYPE_USER) != !!reg->user_addr)
 		return -EFAULT;
 	if (!reg->size || reg->mmap_offset || reg->id)
 		return -EINVAL;
@@ -304,7 +338,10 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	}
 	mr->nr_pages = nr_pages;
 
-	ret = io_region_pin_pages(ctx, mr, reg);
+	if (reg->flags & IORING_MEM_REGION_TYPE_USER)
+		ret = io_region_pin_pages(ctx, mr, reg);
+	else
+		ret = io_region_allocate_pages(ctx, mr, reg);
 	if (ret)
 		goto out_free;
 
diff --git a/io_uring/register.c b/io_uring/register.c
index ba61697d7a53..f043d3f6b026 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -586,6 +586,8 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	if (copy_from_user(&rd, rd_uptr, sizeof(rd)))
 		return -EFAULT;
 
+	if (!(rd.flags & IORING_MEM_REGION_TYPE_USER))
+		return -EINVAL;
 	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
 		return -EINVAL;
 	if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG)

From patchwork Fri Nov 29 13:34:32 2024
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 11/18] io_uring/memmap: implement mmap for regions
Date: Fri, 29 Nov 2024 13:34:32 +0000
Message-ID: <0f1212bd6af7fb39b63514b34fae8948014221d1.1732886067.git.asml.silence@gmail.com>

The patch implements mmap for the param region and enables the kernel
allocation mode.
Internally it uses a fixed mmap offset, however the user has to use the
offset returned in struct io_uring_region_desc::mmap_offset.

Note, mmap doesn't and can't take ->uring_lock; the region / ring lookup
is protected by ->mmap_lock, and it's directly peeking at
ctx->param_region. We can't protect io_create_region() with the
mmap_lock as it'd deadlock, which is why io_create_region_mmap_safe()
initialises the region for us in a temporary variable and then publishes
it with the lock taken. It's intentionally decoupled from the main region
helpers, and in the future we might want to have a list of active
regions, which then could be protected by the ->mmap_lock.

Signed-off-by: Pavel Begunkov
---
 io_uring/memmap.c   | 61 +++++++++++++++++++++++++++++++++++++++++----
 io_uring/memmap.h   | 10 +++++++-
 io_uring/register.c |  6 ++---
 3 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 0908a71bf57e..9a182c8a4be1 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -275,7 +275,8 @@ static int io_region_pin_pages(struct io_ring_ctx *ctx,
 
 static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 				    struct io_mapped_region *mr,
-				    struct io_uring_region_desc *reg)
+				    struct io_uring_region_desc *reg,
+				    unsigned long mmap_offset)
 {
 	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
 	unsigned long size = mr->nr_pages << PAGE_SHIFT;
@@ -290,8 +291,7 @@ static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 	p = io_mem_alloc_compound(pages, mr->nr_pages, size, gfp);
 	if (!IS_ERR(p)) {
 		mr->flags |= IO_REGION_F_SINGLE_REF;
-		mr->pages = pages;
-		return 0;
+		goto done;
 	}
 
 	nr_allocated = alloc_pages_bulk_array_node(gfp, NUMA_NO_NODE,
@@ -302,12 +302,15 @@ static int io_region_allocate_pages(struct io_ring_ctx *ctx,
 		kvfree(pages);
 		return -ENOMEM;
 	}
+done:
+	reg->mmap_offset = mmap_offset;
 	mr->pages = pages;
 	return 0;
 }
 
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
-		     struct io_uring_region_desc *reg)
+		     struct io_uring_region_desc *reg,
+		     unsigned long mmap_offset)
 {
 	int nr_pages, ret;
 	u64 end;
@@ -341,7 +344,7 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	if (reg->flags & IORING_MEM_REGION_TYPE_USER)
 		ret = io_region_pin_pages(ctx, mr, reg);
 	else
-		ret = io_region_allocate_pages(ctx, mr, reg);
+		ret = io_region_allocate_pages(ctx, mr, reg, mmap_offset);
 	if (ret)
 		goto out_free;
 
@@ -354,6 +357,40 @@ int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
 	return ret;
 }
 
+int io_create_region_mmap_safe(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
+			       struct io_uring_region_desc *reg,
+			       unsigned long mmap_offset)
+{
+	struct io_mapped_region tmp_mr;
+	int ret;
+
+	memcpy(&tmp_mr, mr, sizeof(tmp_mr));
+	ret = io_create_region(ctx, &tmp_mr, reg, mmap_offset);
+	if (ret)
+		return ret;
+
+	/*
+	 * Once published mmap can find it while holding only the ->mmap_lock
+	 * and not ->uring_lock.
+	 */
+	guard(mutex)(&ctx->mmap_lock);
+	memcpy(mr, &tmp_mr, sizeof(tmp_mr));
+	return 0;
+}
+
+static void *io_region_validate_mmap(struct io_ring_ctx *ctx,
+				     struct io_mapped_region *mr)
+{
+	lockdep_assert_held(&ctx->mmap_lock);
+
+	if (!io_region_is_set(mr))
+		return ERR_PTR(-EINVAL);
+	if (mr->flags & IO_REGION_F_USER_PROVIDED)
+		return ERR_PTR(-EINVAL);
+
+	return io_region_get_ptr(mr);
+}
+
 static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 					    size_t sz)
 {
@@ -389,6 +426,8 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 		io_put_bl(ctx, bl);
 		return ptr;
 	}
+	case IORING_MAP_OFF_PARAM_REGION:
+		return io_region_validate_mmap(ctx, &ctx->param_region);
 	}
 
 	return ERR_PTR(-EINVAL);
@@ -405,6 +444,16 @@ int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
 
 #ifdef CONFIG_MMU
 
+static int io_region_mmap(struct io_ring_ctx *ctx,
+			  struct io_mapped_region *mr,
+			  struct vm_area_struct *vma)
+{
+	unsigned long nr_pages = mr->nr_pages;
+
+	vm_flags_set(vma, VM_DONTEXPAND);
+	return vm_insert_pages(vma, vma->vm_start, mr->pages, &nr_pages);
+}
+
 __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct io_ring_ctx *ctx = file->private_data;
@@ -429,6 +478,8 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 				     ctx->n_sqe_pages);
 	case IORING_OFF_PBUF_RING:
 		return io_pbuf_mmap(file, vma);
+	case IORING_MAP_OFF_PARAM_REGION:
+		return io_region_mmap(ctx, &ctx->param_region, vma);
 	}
 
 	return -EINVAL;
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index 2096a8427277..2402bca3d700 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -1,6 +1,8 @@
 #ifndef IO_URING_MEMMAP_H
 #define IO_URING_MEMMAP_H
 
+#define IORING_MAP_OFF_PARAM_REGION	0x20000000ULL
+
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages);
 void io_pages_free(struct page ***pages, int npages);
 int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
@@ -24,7 +26,13 @@ int io_uring_mmap(struct file *file, struct vm_area_struct *vma);
 
 void io_free_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr);
 int io_create_region(struct io_ring_ctx *ctx, struct io_mapped_region *mr,
-		     struct io_uring_region_desc *reg);
+		     struct io_uring_region_desc *reg,
+		     unsigned long mmap_offset);
+
+int io_create_region_mmap_safe(struct io_ring_ctx *ctx,
+			       struct io_mapped_region *mr,
+			       struct io_uring_region_desc *reg,
+			       unsigned long mmap_offset);
 
 static inline void *io_region_get_ptr(struct io_mapped_region *mr)
 {
diff --git a/io_uring/register.c b/io_uring/register.c
index f043d3f6b026..5b099ec36d00 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -585,9 +585,6 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	rd_uptr = u64_to_user_ptr(reg.region_uptr);
 	if (copy_from_user(&rd, rd_uptr, sizeof(rd)))
 		return -EFAULT;
-
-	if (!(rd.flags & IORING_MEM_REGION_TYPE_USER))
-		return -EINVAL;
 	if (memchr_inv(&reg.__resv, 0, sizeof(reg.__resv)))
 		return -EINVAL;
 	if (reg.flags & ~IORING_MEM_REGION_REG_WAIT_ARG)
@@ -602,7 +599,8 @@ static int io_register_mem_region(struct io_ring_ctx *ctx, void __user *uarg)
 	    !(ctx->flags & IORING_SETUP_R_DISABLED))
 		return -EINVAL;
 
-	ret = io_create_region(ctx, &ctx->param_region, &rd);
+	ret = io_create_region_mmap_safe(ctx, &ctx->param_region, &rd,
+					 IORING_MAP_OFF_PARAM_REGION);
 	if (ret)
 		return ret;
 	if (copy_to_user(rd_uptr, &rd, sizeof(rd))) {

From patchwork Fri Nov 29 13:34:33 2024
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 12/18] io_uring: pass ctx to io_register_free_rings
Date: Fri, 29 Nov 2024 13:34:33 +0000

A preparation patch: pass the context to io_register_free_rings().
Signed-off-by: Pavel Begunkov
---
 io_uring/register.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/io_uring/register.c b/io_uring/register.c
index 5b099ec36d00..5e07205fb071 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -375,7 +375,8 @@ struct io_ring_ctx_rings {
 	struct io_rings *rings;
 };
 
-static void io_register_free_rings(struct io_uring_params *p,
+static void io_register_free_rings(struct io_ring_ctx *ctx,
+				   struct io_uring_params *p,
 				   struct io_ring_ctx_rings *r)
 {
 	if (!(p->flags & IORING_SETUP_NO_MMAP)) {
@@ -452,7 +453,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	n.rings->cq_ring_entries = p.cq_entries;
 
 	if (copy_to_user(arg, &p, sizeof(p))) {
-		io_register_free_rings(&p, &n);
+		io_register_free_rings(ctx, &p, &n);
 		return -EFAULT;
 	}
 
@@ -461,7 +462,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	else
 		size = array_size(sizeof(struct io_uring_sqe), p.sq_entries);
 	if (size == SIZE_MAX) {
-		io_register_free_rings(&p, &n);
+		io_register_free_rings(ctx, &p, &n);
 		return -EOVERFLOW;
 	}
 
@@ -472,7 +473,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 				     p.sq_off.user_addr, size);
 	if (IS_ERR(ptr)) {
-		io_register_free_rings(&p, &n);
+		io_register_free_rings(ctx, &p, &n);
 		return PTR_ERR(ptr);
 	}
 
@@ -562,7 +563,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 out:
 	spin_unlock(&ctx->completion_lock);
 	mutex_unlock(&ctx->mmap_lock);
-	io_register_free_rings(&p, to_free);
+	io_register_free_rings(ctx, &p, to_free);
 	if (ctx->sq_data)
 		io_sq_thread_unpark(ctx->sq_data);

From patchwork Fri Nov 29 13:34:34 2024
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 13/18] io_uring: use region api for SQ
Date: Fri, 29 Nov 2024 13:34:34 +0000
Message-ID: <1fb73ced6b835cb319ab0fe1dc0b2e982a9a5650.1732886067.git.asml.silence@gmail.com>

Convert internal parts of the SQ management to the region API.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  3 +--
 io_uring/io_uring.c            | 36 +++++++++++++---------------------
 io_uring/memmap.c              |  3 +--
 io_uring/register.c            | 35 +++++++++++++++------------------
 4 files changed, 32 insertions(+), 45 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 4cee414080fd..3f353f269c6e 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -431,10 +431,9 @@ struct io_ring_ctx {
 	 * the gup'ed pages for the two rings, and the sqes.
 	 */
 	unsigned short		n_ring_pages;
-	unsigned short		n_sqe_pages;
 	struct page		**ring_pages;
-	struct page		**sqe_pages;
+	struct io_mapped_region	sq_region;
 
 	/* used for optimised request parameter and wait argument passing  */
 	struct io_mapped_region	param_region;
 };
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index c713ef35447b..2ac80b4d4016 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2637,29 +2637,19 @@ static void *io_rings_map(struct io_ring_ctx *ctx, unsigned long uaddr,
 				size);
 }
 
-static void *io_sqes_map(struct io_ring_ctx *ctx, unsigned long uaddr,
-			 size_t size)
-{
-	return __io_uaddr_map(&ctx->sqe_pages, &ctx->n_sqe_pages, uaddr,
-				size);
-}
-
 static void io_rings_free(struct io_ring_ctx *ctx)
 {
 	if (!(ctx->flags & IORING_SETUP_NO_MMAP)) {
 		io_pages_unmap(ctx->rings, &ctx->ring_pages, &ctx->n_ring_pages,
 				true);
-		io_pages_unmap(ctx->sq_sqes, &ctx->sqe_pages, &ctx->n_sqe_pages,
-				true);
 	} else {
 		io_pages_free(&ctx->ring_pages, ctx->n_ring_pages);
 		ctx->n_ring_pages = 0;
-		io_pages_free(&ctx->sqe_pages, ctx->n_sqe_pages);
-		ctx->n_sqe_pages = 0;
 		vunmap(ctx->rings);
-		vunmap(ctx->sq_sqes);
 	}
+	io_free_region(ctx, &ctx->sq_region);
+
 	ctx->rings = NULL;
 	ctx->sq_sqes = NULL;
 }
@@ -3476,9 +3466,10 @@ bool io_is_uring_fops(struct file *file)
 static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 					 struct io_uring_params *p)
 {
+	struct io_uring_region_desc rd;
 	struct io_rings *rings;
 	size_t size, sq_array_offset;
-	void *ptr;
+	int ret;
 
 	/* make sure these are sane, as we already accounted them */
 	ctx->sq_entries = p->sq_entries;
@@ -3514,17 +3505,18 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 		return -EOVERFLOW;
 	}
 
-	if (!(ctx->flags & IORING_SETUP_NO_MMAP))
-		ptr = io_pages_map(&ctx->sqe_pages, &ctx->n_sqe_pages, size);
-	else
-		ptr = io_sqes_map(ctx, p->sq_off.user_addr, size);
-
-	if (IS_ERR(ptr)) {
+	memset(&rd, 0, sizeof(rd));
+	rd.size = PAGE_ALIGN(size);
+	if (ctx->flags & IORING_SETUP_NO_MMAP) {
+		rd.user_addr = p->sq_off.user_addr;
+		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+	}
+	ret = io_create_region(ctx, &ctx->sq_region, &rd, IORING_OFF_SQES);
+	if (ret) {
 		io_rings_free(ctx);
-		return PTR_ERR(ptr);
+		return ret;
 	}
-
-	ctx->sq_sqes = ptr;
+	ctx->sq_sqes = io_region_get_ptr(&ctx->sq_region);
 	return 0;
 }
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 9a182c8a4be1..b9aaa25182a5 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -474,8 +474,7 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 		npages = min(ctx->n_ring_pages, (sz + PAGE_SIZE - 1) >> PAGE_SHIFT);
 		return io_uring_mmap_pages(ctx, vma, ctx->ring_pages, npages);
 	case IORING_OFF_SQES:
-		return io_uring_mmap_pages(ctx, vma, ctx->sqe_pages,
-					   ctx->n_sqe_pages);
+		return io_region_mmap(ctx, &ctx->sq_region, vma);
 	case IORING_OFF_PBUF_RING:
 		return io_pbuf_mmap(file, vma);
 	case IORING_MAP_OFF_PARAM_REGION:
diff --git a/io_uring/register.c b/io_uring/register.c
index 5e07205fb071..44cd64923d31 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -368,11 +368,11 @@ static int io_register_clock(struct io_ring_ctx *ctx,
  */
 struct io_ring_ctx_rings {
 	unsigned short n_ring_pages;
-	unsigned short n_sqe_pages;
 	struct page **ring_pages;
-	struct page **sqe_pages;
-	struct io_uring_sqe *sq_sqes;
 	struct io_rings *rings;
+
+	struct io_uring_sqe *sq_sqes;
+	struct io_mapped_region sq_region;
 };
 
 static void io_register_free_rings(struct io_ring_ctx *ctx,
@@ -382,14 +382,11 @@ static void io_register_free_rings(struct io_ring_ctx *ctx,
 	if (!(p->flags & IORING_SETUP_NO_MMAP)) {
 		io_pages_unmap(r->rings, &r->ring_pages, &r->n_ring_pages,
 				true);
-		io_pages_unmap(r->sq_sqes, &r->sqe_pages, &r->n_sqe_pages,
-				true);
 	} else {
 		io_pages_free(&r->ring_pages, r->n_ring_pages);
-		io_pages_free(&r->sqe_pages, r->n_sqe_pages);
 		vunmap(r->rings);
-		vunmap(r->sq_sqes);
 	}
+	io_free_region(ctx, &r->sq_region);
 }
 
 #define swap_old(ctx, o, n, field)		\
@@ -404,11 +401,11 @@ static void io_register_free_rings(struct io_ring_ctx *ctx,
 
 static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 {
+	struct io_uring_region_desc rd;
 	struct io_ring_ctx_rings o = { }, n = { }, *to_free = NULL;
 	size_t size, sq_array_offset;
 	struct io_uring_params p;
 	unsigned i, tail;
-	void *ptr;
 	int ret;
 
 	/* for single issuer, must be owner resizing */
@@ -466,16 +463,18 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 		return -EOVERFLOW;
 	}
 
-	if (!(p.flags & IORING_SETUP_NO_MMAP))
-		ptr = io_pages_map(&n.sqe_pages, &n.n_sqe_pages, size);
-	else
-		ptr = __io_uaddr_map(&n.sqe_pages, &n.n_sqe_pages,
-				     p.sq_off.user_addr,
-				     size);
-	if (IS_ERR(ptr)) {
+	memset(&rd, 0, sizeof(rd));
+	rd.size = PAGE_ALIGN(size);
+	if (p.flags & IORING_SETUP_NO_MMAP) {
+		rd.user_addr = p.sq_off.user_addr;
+		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+	}
+	ret = io_create_region_mmap_safe(ctx, &n.sq_region, &rd, IORING_OFF_SQES);
+	if (ret) {
 		io_register_free_rings(ctx, &p, &n);
-		return PTR_ERR(ptr);
+		return ret;
 	}
+	n.sq_sqes = io_region_get_ptr(&n.sq_region);
 
 	/*
 	 * If using SQPOLL, park the thread
@@ -506,7 +505,6 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	 * Now copy SQ and CQ entries, if any. If either of the destination
 	 * rings can't hold what is already there, then fail the operation.
 	 */
-	n.sq_sqes = ptr;
 	tail = o.rings->sq.tail;
 	if (tail - o.rings->sq.head > p.sq_entries)
 		goto overflow;
@@ -555,9 +553,8 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	ctx->rings = n.rings;
 	ctx->sq_sqes = n.sq_sqes;
 	swap_old(ctx, o, n, n_ring_pages);
-	swap_old(ctx, o, n, n_sqe_pages);
 	swap_old(ctx, o, n, ring_pages);
-	swap_old(ctx, o, n, sqe_pages);
+	swap_old(ctx, o, n, sq_region);
 	to_free = &o;
 	ret = 0;
 out:

From patchwork Fri Nov 29 13:34:35 2024
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cOq4I7my" Received: by mail-lf1-f47.google.com with SMTP id 2adb3069b0e04-53dd2fdcebcso2251929e87.0 for ; Fri, 29 Nov 2024 05:34:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1732887247; x=1733492047; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pUrcV/+245WYhaI60GoMPB6fD2CLfdvaS+bPhyo8mxE=; b=cOq4I7myvmCsKi/lWeH6JSr3SV55lCzUt7dkhswr3hbTnr1ZfNrdcTPZAb2vInE7Ly 5CsRD3mQCGpkdQTn3fft+MGkwqHBqhKAxR/klM9ndq+F1+PUl2bFomNbrR95m0g3H7Ro 3gQGG1TKXcTKG+RH/PiXd6sOXX46e+Gfjl9xG4Xh5v3VO9Bk4Bjn7fs8cTukVe0sTHR7 3WWm9jtxqe9vJuHhvnTp9tMZc7KWuYiC2HM4Iqh8wPeaTRZ7x8uojcKgZl4Y59Yo47do rG1VG8d/2v62i8R5nH/lwl+s/Ofo29iLAYLMIcx48kdNZfQofi5U8t1vcMR+5pKYKqJY Z5yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732887247; x=1733492047; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pUrcV/+245WYhaI60GoMPB6fD2CLfdvaS+bPhyo8mxE=; b=T3QgQQDdPUMv1HqHOJOedxin9OpAD8ZcvJeEa9leuItEKN3jMep+j/aNe2R7AdqNTu CY9wBNu68sut74YeFAznwhe/yPFvmuTtkkgkg+3g/a02532T+qzo6O8Hg8io4zU1t6Pm im+0uBothgXUM93Sm5jv2miHGTi7WaOti/+wzKtqZ9hEPAoow9UuAWf0Ez5HMOmHhP3l 9OwUq6p2JLUAMgdrPDr90gXQkHKlwWn3CQqQDh2aTQzTNgbiSFwanE5TQFzoo1678WPG qHdxxXNjAb8vJO4OMkNivERMDUsjCoUdHxxnC9KSl3C6uG2sISMC9x6a0YwwnEv5KLQe J1Cg== X-Gm-Message-State: AOJu0YzPvHJT7T/DMbWOv1az0xzzMUcYS6zFvJS/kPu3oQh+HimeLiyR P38By/NzFrAYtERkgqTLnco1p2IaYpTwlrIlS5P3ptxMpJzGuCRKpIG/8A== X-Gm-Gg: 
ASbGncvhM5opqMdnJ69+gNl3rLbfwQ5EtcWbsXhBitr3XwAxper4Gt4sPLlAdvcF0fq Du0f0P1OkM8Fddh5FbsqYqQlhXjfzbrErRNtNFQUArKowEPOX7T4asVtqpX67bXpjbRK75DaBM+ 1RB/FrEm3fxIy2atvmUFP4+lSQtdSuP68ZhKys/KC7CWRRRyQZ/GbJxiKtLWNg4cLDA52+NyDxW eXOjc7zm/8gWOxbxyv0wgbf9gDFOqNn9cszDyVLOi6yMKi4I9YMy/O6lS2+F3XU X-Google-Smtp-Source: AGHT+IE2kX/SQASdXVk7VG0DWiGp8aeNi26RUtj8fzQwqOsNg0BBiKVXgapXfM1D5onVOMQ73oCMVA== X-Received: by 2002:a05:6512:3a8e:b0:53d:ede4:35ff with SMTP id 2adb3069b0e04-53df00ff716mr7721731e87.38.1732887247285; Fri, 29 Nov 2024 05:34:07 -0800 (PST) Received: from 127.0.0.1localhost ([163.114.131.193]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-aa5996c2471sm173996866b.13.2024.11.29.05.34.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 29 Nov 2024 05:34:06 -0800 (PST) From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com Subject: [PATCH v3 14/18] io_uring: use region api for CQ Date: Fri, 29 Nov 2024 13:34:35 +0000 Message-ID: <46fc3c801290d6b1ac16023d78f6b8e685c87fd6.1732886067.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: io-uring@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Convert internal parts of the CQ/SQ array managment to the region API. Signed-off-by: Pavel Begunkov --- include/linux/io_uring_types.h | 8 +---- io_uring/io_uring.c | 36 +++++++--------------- io_uring/memmap.c | 55 +++++----------------------------- io_uring/memmap.h | 4 --- io_uring/register.c | 35 ++++++++++------------ 5 files changed, 36 insertions(+), 102 deletions(-) diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 3f353f269c6e..2db252841509 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -426,14 +426,8 @@ struct io_ring_ctx { */ struct mutex mmap_lock; - /* - * If IORING_SETUP_NO_MMAP is used, then the below holds - * the gup'ed pages for the two rings, and the sqes. 
-	 */
-	unsigned short n_ring_pages;
-	struct page **ring_pages;
-
 	struct io_mapped_region sq_region;
+	struct io_mapped_region ring_region;
 
 	/* used for optimised request parameter and wait argument passing */
 	struct io_mapped_region param_region;
 };
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 2ac80b4d4016..bc0ab2bb7ae2 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2630,26 +2630,10 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	return READ_ONCE(rings->cq.head) == READ_ONCE(rings->cq.tail) ? ret : 0;
 }
 
-static void *io_rings_map(struct io_ring_ctx *ctx, unsigned long uaddr,
-			  size_t size)
-{
-	return __io_uaddr_map(&ctx->ring_pages, &ctx->n_ring_pages, uaddr,
-				size);
-}
-
 static void io_rings_free(struct io_ring_ctx *ctx)
 {
-	if (!(ctx->flags & IORING_SETUP_NO_MMAP)) {
-		io_pages_unmap(ctx->rings, &ctx->ring_pages, &ctx->n_ring_pages,
-				true);
-	} else {
-		io_pages_free(&ctx->ring_pages, ctx->n_ring_pages);
-		ctx->n_ring_pages = 0;
-		vunmap(ctx->rings);
-	}
 	io_free_region(ctx, &ctx->sq_region);
-
+	io_free_region(ctx, &ctx->ring_region);
 	ctx->rings = NULL;
 	ctx->sq_sqes = NULL;
 }
@@ -3480,15 +3464,17 @@ static __cold int io_allocate_scq_urings(struct io_ring_ctx *ctx,
 	if (size == SIZE_MAX)
 		return -EOVERFLOW;
 
-	if (!(ctx->flags & IORING_SETUP_NO_MMAP))
-		rings = io_pages_map(&ctx->ring_pages, &ctx->n_ring_pages, size);
-	else
-		rings = io_rings_map(ctx, p->cq_off.user_addr, size);
-
-	if (IS_ERR(rings))
-		return PTR_ERR(rings);
+	memset(&rd, 0, sizeof(rd));
+	rd.size = PAGE_ALIGN(size);
+	if (ctx->flags & IORING_SETUP_NO_MMAP) {
+		rd.user_addr = p->cq_off.user_addr;
+		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+	}
+	ret = io_create_region(ctx, &ctx->ring_region, &rd, IORING_OFF_CQ_RING);
+	if (ret)
+		return ret;
+	ctx->rings = rings = io_region_get_ptr(&ctx->ring_region);
 
-	ctx->rings = rings;
 	if (!(ctx->flags & IORING_SETUP_NO_SQARRAY))
 		ctx->sq_array = (u32 *)((char *)rings + sq_array_offset);
 	rings->sq_ring_mask = p->sq_entries - 1;
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index b9aaa25182a5..668b1c3579a2 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -120,18 +120,6 @@ void io_pages_unmap(void *ptr, struct page ***pages, unsigned short *npages,
 	*npages = 0;
 }
 
-void io_pages_free(struct page ***pages, int npages)
-{
-	struct page **page_array = *pages;
-
-	if (!page_array)
-		return;
-
-	unpin_user_pages(page_array, npages);
-	kvfree(page_array);
-	*pages = NULL;
-}
-
 struct page **io_pin_pages(unsigned long uaddr, unsigned long len, int *npages)
 {
 	unsigned long start, end, nr_pages;
@@ -174,34 +162,6 @@ struct page **io_pin_pages(unsigned long uaddr, unsigned long len, int *npages)
 	return ERR_PTR(ret);
 }
 
-void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
-		     unsigned long uaddr, size_t size)
-{
-	struct page **page_array;
-	unsigned int nr_pages;
-	void *page_addr;
-
-	*npages = 0;
-
-	if (uaddr & (PAGE_SIZE - 1) || !size)
-		return ERR_PTR(-EINVAL);
-
-	nr_pages = 0;
-	page_array = io_pin_pages(uaddr, size, &nr_pages);
-	if (IS_ERR(page_array))
-		return page_array;
-
-	page_addr = vmap(page_array, nr_pages, VM_MAP, PAGE_KERNEL);
-	if (page_addr) {
-		*pages = page_array;
-		*npages = nr_pages;
-		return page_addr;
-	}
-
-	io_pages_free(&page_array, nr_pages);
-	return ERR_PTR(-ENOMEM);
-}
-
 enum {
 	/* memory was vmap'ed for the kernel, freeing the region vunmap's it */
 	IO_REGION_F_VMAP = 1,
@@ -446,9 +406,10 @@ int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
 
 static int io_region_mmap(struct io_ring_ctx *ctx,
 			  struct io_mapped_region *mr,
-			  struct vm_area_struct *vma)
+			  struct vm_area_struct *vma,
+			  unsigned max_pages)
 {
-	unsigned long nr_pages = mr->nr_pages;
+	unsigned long nr_pages = min(mr->nr_pages, max_pages);
 
 	vm_flags_set(vma, VM_DONTEXPAND);
 	return vm_insert_pages(vma, vma->vm_start, mr->pages, &nr_pages);
@@ -459,7 +420,7 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	struct io_ring_ctx *ctx = file->private_data;
 	size_t sz = vma->vm_end - vma->vm_start;
 	long offset = vma->vm_pgoff << PAGE_SHIFT;
-	unsigned int npages;
+	unsigned int page_limit;
 	void *ptr;
 
 	guard(mutex)(&ctx->mmap_lock);
@@ -471,14 +432,14 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	switch (offset & IORING_OFF_MMAP_MASK) {
 	case IORING_OFF_SQ_RING:
 	case IORING_OFF_CQ_RING:
-		npages = min(ctx->n_ring_pages, (sz + PAGE_SIZE - 1) >> PAGE_SHIFT);
-		return io_uring_mmap_pages(ctx, vma, ctx->ring_pages, npages);
+		page_limit = (sz + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		return io_region_mmap(ctx, &ctx->ring_region, vma, page_limit);
 	case IORING_OFF_SQES:
-		return io_region_mmap(ctx, &ctx->sq_region, vma);
+		return io_region_mmap(ctx, &ctx->sq_region, vma, UINT_MAX);
 	case IORING_OFF_PBUF_RING:
 		return io_pbuf_mmap(file, vma);
 	case IORING_MAP_OFF_PARAM_REGION:
-		return io_region_mmap(ctx, &ctx->param_region, vma);
+		return io_region_mmap(ctx, &ctx->param_region, vma, UINT_MAX);
 	}
 
 	return -EINVAL;
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index 2402bca3d700..7395996eb353 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -4,7 +4,6 @@
 #define IORING_MAP_OFF_PARAM_REGION 0x20000000ULL
 
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages);
-void io_pages_free(struct page ***pages, int npages);
 int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
 			struct page **pages, int npages);
 
@@ -13,9 +12,6 @@ void *io_pages_map(struct page ***out_pages, unsigned short *npages,
 void io_pages_unmap(void *ptr, struct page ***pages, unsigned short *npages,
 		    bool put_pages);
 
-void *__io_uaddr_map(struct page ***pages, unsigned short *npages,
-		     unsigned long uaddr, size_t size);
-
 #ifndef CONFIG_MMU
 unsigned int io_uring_nommu_mmap_capabilities(struct file *file);
 #endif
diff --git a/io_uring/register.c b/io_uring/register.c
index 44cd64923d31..f1698c18c7cb 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -367,26 +367,19 @@ static int io_register_clock(struct io_ring_ctx *ctx,
  * either mapping or freeing.
  */
 struct io_ring_ctx_rings {
-	unsigned short n_ring_pages;
-	struct page **ring_pages;
 	struct io_rings *rings;
 
 	struct io_uring_sqe *sq_sqes;
 	struct io_mapped_region sq_region;
+	struct io_mapped_region ring_region;
 };
 
 static void io_register_free_rings(struct io_ring_ctx *ctx,
 				   struct io_uring_params *p,
 				   struct io_ring_ctx_rings *r)
 {
-	if (!(p->flags & IORING_SETUP_NO_MMAP)) {
-		io_pages_unmap(r->rings, &r->ring_pages, &r->n_ring_pages,
-				true);
-	} else {
-		io_pages_free(&r->ring_pages, r->n_ring_pages);
-		vunmap(r->rings);
-	}
 	io_free_region(ctx, &r->sq_region);
+	io_free_region(ctx, &r->ring_region);
 }
 
 #define swap_old(ctx, o, n, field) \
@@ -436,13 +429,18 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	if (size == SIZE_MAX)
 		return -EOVERFLOW;
 
-	if (!(p.flags & IORING_SETUP_NO_MMAP))
-		n.rings = io_pages_map(&n.ring_pages, &n.n_ring_pages, size);
-	else
-		n.rings = __io_uaddr_map(&n.ring_pages, &n.n_ring_pages,
-					 p.cq_off.user_addr, size);
-	if (IS_ERR(n.rings))
-		return PTR_ERR(n.rings);
+	memset(&rd, 0, sizeof(rd));
+	rd.size = PAGE_ALIGN(size);
+	if (p.flags & IORING_SETUP_NO_MMAP) {
+		rd.user_addr = p.cq_off.user_addr;
+		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+	}
+	ret = io_create_region_mmap_safe(ctx, &n.ring_region, &rd, IORING_OFF_CQ_RING);
+	if (ret) {
+		io_register_free_rings(ctx, &p, &n);
+		return ret;
+	}
+	n.rings = io_region_get_ptr(&n.ring_region);
 
 	n.rings->sq_ring_mask = p.sq_entries - 1;
 	n.rings->cq_ring_mask = p.cq_entries - 1;
@@ -552,8 +550,7 @@ static int io_register_resize_rings(struct io_ring_ctx *ctx, void __user *arg)
 	ctx->rings = n.rings;
 	ctx->sq_sqes = n.sq_sqes;
-	swap_old(ctx, o, n, n_ring_pages);
-	swap_old(ctx, o, n, ring_pages);
+	swap_old(ctx, o, n, ring_region);
 	swap_old(ctx, o, n, sq_region);
 	to_free = &o;
 	ret = 0;

From patchwork Fri Nov 29 13:34:36 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13888702
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 15/18] io_uring/kbuf: use mmap_lock to sync with mmap
Date: Fri, 29 Nov 2024 13:34:36 +0000

A preparation / cleanup patch simplifying the buf ring - mmap
synchronisation. Instead of relying on RCU, which is trickier, do it by
grabbing the mmap_lock when anyone tries to publish or remove a
registered buffer to / from ->io_bl_xa.

Modifications of the xarray should always be protected by both
->uring_lock and ->mmap_lock, while lookups should hold either of them.
While a struct io_buffer_list is in the xarray, the mmap related fields
like ->flags and ->buf_pages should stay stable.

Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  5 +++
 io_uring/kbuf.c                | 56 +++++++++++++++-------------------
 io_uring/kbuf.h                |  1 -
 3 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 2db252841509..091d1eaf5ba0 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -293,6 +293,11 @@ struct io_ring_ctx {
 
 	struct io_submit_state submit_state;
 
+	/*
+	 * Modifications are protected by ->uring_lock and ->mmap_lock.
+	 * The flags, buf_pages and buf_nr_pages fields should be stable
+	 * once published.
+	 */
 	struct xarray io_bl_xa;
 
 	struct io_hash_table cancel_table;
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index d407576ddfb7..662e928cc3b0 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -45,10 +45,11 @@ static int io_buffer_add_list(struct io_ring_ctx *ctx,
 	/*
 	 * Store buffer group ID and finally mark the list as visible.
 	 * The normal lookup doesn't care about the visibility as we're
-	 * always under the ->uring_lock, but the RCU lookup from mmap does.
+	 * always under the ->uring_lock, but lookups from mmap do.
 	 */
 	bl->bgid = bgid;
 	atomic_set(&bl->refs, 1);
+	guard(mutex)(&ctx->mmap_lock);
 	return xa_err(xa_store(&ctx->io_bl_xa, bgid, bl, GFP_KERNEL));
 }
 
@@ -388,7 +389,7 @@ void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
 {
 	if (atomic_dec_and_test(&bl->refs)) {
 		__io_remove_buffers(ctx, bl, -1U);
-		kfree_rcu(bl, rcu);
+		kfree(bl);
 	}
 }
 
@@ -397,10 +398,17 @@ void io_destroy_buffers(struct io_ring_ctx *ctx)
 	struct io_buffer_list *bl;
 	struct list_head *item, *tmp;
 	struct io_buffer *buf;
-	unsigned long index;
 
-	xa_for_each(&ctx->io_bl_xa, index, bl) {
-		xa_erase(&ctx->io_bl_xa, bl->bgid);
+	while (1) {
+		unsigned long index = 0;
+
+		scoped_guard(mutex, &ctx->mmap_lock) {
+			bl = xa_find(&ctx->io_bl_xa, &index, ULONG_MAX, XA_PRESENT);
+			if (bl)
+				xa_erase(&ctx->io_bl_xa, bl->bgid);
+		}
+		if (!bl)
+			break;
 		io_put_bl(ctx, bl);
 	}
 
@@ -589,11 +597,7 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 		INIT_LIST_HEAD(&bl->buf_list);
 		ret = io_buffer_add_list(ctx, bl, p->bgid);
 		if (ret) {
-			/*
-			 * Doesn't need rcu free as it was never visible, but
-			 * let's keep it consistent throughout.
-			 */
-			kfree_rcu(bl, rcu);
+			kfree(bl);
 			goto err;
 		}
 	}
@@ -736,7 +740,7 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 		return 0;
 	}
 
-	kfree_rcu(free_bl, rcu);
+	kfree(free_bl);
 	return ret;
 }
 
@@ -760,7 +764,9 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 	if (!(bl->flags & IOBL_BUF_RING))
 		return -EINVAL;
 
-	xa_erase(&ctx->io_bl_xa, bl->bgid);
+	scoped_guard(mutex, &ctx->mmap_lock)
+		xa_erase(&ctx->io_bl_xa, bl->bgid);
+
 	io_put_bl(ctx, bl);
 	return 0;
 }
@@ -795,29 +801,13 @@ struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
 				      unsigned long bgid)
 {
 	struct io_buffer_list *bl;
-	bool ret;
 
-	/*
-	 * We have to be a bit careful here - we're inside mmap and cannot grab
-	 * the uring_lock. This means the buffer_list could be simultaneously
-	 * going away, if someone is trying to be sneaky. Look it up under rcu
-	 * so we know it's not going away, and attempt to grab a reference to
-	 * it. If the ref is already zero, then fail the mapping. If successful,
-	 * the caller will call io_put_bl() to drop the the reference at at the
-	 * end. This may then safely free the buffer_list (and drop the pages)
-	 * at that point, vm_insert_pages() would've already grabbed the
-	 * necessary vma references.
-	 */
-	rcu_read_lock();
 	bl = xa_load(&ctx->io_bl_xa, bgid);
 	/* must be a mmap'able buffer ring and have pages */
-	ret = false;
-	if (bl && bl->flags & IOBL_MMAP)
-		ret = atomic_inc_not_zero(&bl->refs);
-	rcu_read_unlock();
-
-	if (ret)
-		return bl;
+	if (bl && bl->flags & IOBL_MMAP) {
+		if (atomic_inc_not_zero(&bl->refs))
+			return bl;
+	}
 
 	return ERR_PTR(-EINVAL);
 }
@@ -829,6 +819,8 @@ int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma)
 	struct io_buffer_list *bl;
 	int bgid, ret;
 
+	lockdep_assert_held(&ctx->mmap_lock);
+
 	bgid = (pgoff & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
 	bl = io_pbuf_get_bl(ctx, bgid);
 	if (IS_ERR(bl))
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index 36aadfe5ac00..d5e4afcbfbb3 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -25,7 +25,6 @@ struct io_buffer_list {
 			struct page **buf_pages;
 			struct io_uring_buf_ring *buf_ring;
 		};
-		struct rcu_head rcu;
 	};
 
 	__u16 bgid;

From patchwork Fri Nov 29 13:34:37 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13888703
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 16/18] io_uring/kbuf: remove pbuf ring refcounting
Date: Fri, 29 Nov 2024 13:34:37 +0000
Message-ID: <4a9cc54bf0077bb2bf2f3daf917549ddd41080da.1732886067.git.asml.silence@gmail.com>

struct io_buffer_list refcounting was needed for RCU-based sync with
mmap; now we can kill it.
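To make the lifetime argument concrete, here is an illustrative userspace C sketch (the table, the function names and the pthread mutex are stand-ins for the xarray and ->mmap_lock; none of this is the kernel API): when every lookup-and-use runs entirely under the same lock that guards removal, the object cannot disappear mid-use, so no per-object refcount is needed.

```c
/* Hypothetical userspace model of the locking rule this patch relies on. */
#include <pthread.h>

#define MAX_GROUPS 16

struct buf_list {
	int bgid;
	int live;
};

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static struct buf_list table[MAX_GROUPS];

/* Publish under the lock (mirrors xa_store() under ->mmap_lock). */
int bl_register(int bgid)
{
	if (bgid < 0 || bgid >= MAX_GROUPS)
		return -1;
	pthread_mutex_lock(&map_lock);
	table[bgid].bgid = bgid;
	table[bgid].live = 1;
	pthread_mutex_unlock(&map_lock);
	return 0;
}

/* Remove under the same lock (mirrors xa_erase() under ->mmap_lock). */
void bl_unregister(int bgid)
{
	pthread_mutex_lock(&map_lock);
	table[bgid].live = 0;
	pthread_mutex_unlock(&map_lock);
}

/* Lookup and use complete under the lock, so the entry cannot be
 * removed concurrently and no atomic_inc_not_zero() dance is needed. */
int bl_lookup_bgid(int bgid)
{
	int ret = -1;

	pthread_mutex_lock(&map_lock);
	if (bgid >= 0 && bgid < MAX_GROUPS && table[bgid].live)
		ret = table[bgid].bgid;
	pthread_mutex_unlock(&map_lock);
	return ret;
}
```

The design point: the refcount existed only because RCU lookups could race with removal; once both sides serialise on one mutex, the lock itself pins the object for the duration of the use.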
Signed-off-by: Pavel Begunkov
---
 io_uring/kbuf.c   | 21 +++++++--------------
 io_uring/kbuf.h   |  3 ---
 io_uring/memmap.c |  1 -
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 662e928cc3b0..644f61445ec9 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -48,7 +48,6 @@ static int io_buffer_add_list(struct io_ring_ctx *ctx,
 	 * always under the ->uring_lock, but lookups from mmap do.
 	 */
 	bl->bgid = bgid;
-	atomic_set(&bl->refs, 1);
 	guard(mutex)(&ctx->mmap_lock);
 	return xa_err(xa_store(&ctx->io_bl_xa, bgid, bl, GFP_KERNEL));
 }
@@ -385,12 +384,10 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx,
 	return i;
 }
 
-void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
+static void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl)
 {
-	if (atomic_dec_and_test(&bl->refs)) {
-		__io_remove_buffers(ctx, bl, -1U);
-		kfree(bl);
-	}
+	__io_remove_buffers(ctx, bl, -1U);
+	kfree(bl);
 }
 
 void io_destroy_buffers(struct io_ring_ctx *ctx)
@@ -804,10 +801,8 @@ struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
 
 	bl = xa_load(&ctx->io_bl_xa, bgid);
 	/* must be a mmap'able buffer ring and have pages */
-	if (bl && bl->flags & IOBL_MMAP) {
-		if (atomic_inc_not_zero(&bl->refs))
-			return bl;
-	}
+	if (bl && bl->flags & IOBL_MMAP)
+		return bl;
 
 	return ERR_PTR(-EINVAL);
 }
@@ -817,7 +812,7 @@ int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma)
 	struct io_ring_ctx *ctx = file->private_data;
 	loff_t pgoff = vma->vm_pgoff << PAGE_SHIFT;
 	struct io_buffer_list *bl;
-	int bgid, ret;
+	int bgid;
 
 	lockdep_assert_held(&ctx->mmap_lock);
 
@@ -826,7 +821,5 @@ int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma)
 	if (IS_ERR(bl))
 		return PTR_ERR(bl);
 
-	ret = io_uring_mmap_pages(ctx, vma, bl->buf_pages, bl->buf_nr_pages);
-	io_put_bl(ctx, bl);
-	return ret;
+	return io_uring_mmap_pages(ctx, vma, bl->buf_pages, bl->buf_nr_pages);
 }
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index d5e4afcbfbb3..dff7444026a6 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -35,8 +35,6 @@ struct io_buffer_list {
 	__u16 mask;
 
 	__u16 flags;
-
-	atomic_t refs;
 };
 
 struct io_buffer {
@@ -83,7 +81,6 @@ void __io_put_kbuf(struct io_kiocb *req, int len, unsigned issue_flags);
 
 bool io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
 
-void io_put_bl(struct io_ring_ctx *ctx, struct io_buffer_list *bl);
 struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
 				      unsigned long bgid);
 int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma);
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 668b1c3579a2..73b73f4ea1bd 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -383,7 +383,6 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 		if (IS_ERR(bl))
 			return bl;
 		ptr = bl->buf_ring;
-		io_put_bl(ctx, bl);
 		return ptr;
 	}
 	case IORING_MAP_OFF_PARAM_REGION:

From patchwork Fri Nov 29 13:34:38 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13888704
MIME-Version; b=gcPSJP/FsnvwxNGS3Rku5bHjPzOuUGqR5BeuAWvxzGjuYoP91Hl8MVmqF/alh3Y/wUdw063+yxEsEuXg7K8adxxjqbb5gKLHMXaYM8eRYmnNd5FjNJaN4rFkwbG9rWrmKYAYQI4I1polOHk9lvjKsz3oXl8iEILK1obnJOrOO1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=MNdmlqhj; arc=none smtp.client-ip=209.85.218.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MNdmlqhj" Received: by mail-ej1-f44.google.com with SMTP id a640c23a62f3a-aa560a65fd6so333352166b.0 for ; Fri, 29 Nov 2024 05:34:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1732887250; x=1733492050; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=38ES6p6U/RNwBbX2ML0eCmvdyQ+owZSegnJ8pvGfhDM=; b=MNdmlqhj4/lAh2SirVeQgOXlJbvfFQRFHzRIHcjhHgzpHpdyTK6vEY9MJc2myZ/tZ1 nsmYAV1qxTM/2dos0AgvGAJ2hIseYJ05nYBdzq9Udt98dxViOhRNPAywJXETmHkhrLNV kuSryIUcQTaweXy+7elHFgedOShav1zFXfktasutikZ2Lk+PiVb1Mn+64/7QEyMVp6xj FqzrUuOuq5/6CdLuA+lAwl3PhsoDSdC9PO6jDTPkrghMZTYG7eDvb1VLBGy39iI7p5rZ cc3UVvJvShpbvfuc9O+CK29XAVAXD0Q7nCmITCkuHqIt4aC+x+bQPIirQRnkjJG69Oor eu1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732887250; x=1733492050; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=38ES6p6U/RNwBbX2ML0eCmvdyQ+owZSegnJ8pvGfhDM=; b=kdeqKkRh+Ob9v+xzMbuYUo4oWymPalQtp6l3/eFmDLliI9YT+91NLLTVh8OfBA/XcF 
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 17/18] io_uring/kbuf: use region api for pbuf rings
Date: Fri, 29 Nov 2024 13:34:38 +0000
Message-ID: <6c40cf7beaa648558acd4d84bc0fb3279a35d74b.1732886067.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.47.1

Convert internal parts of the provided buffer ring management to the
region API. It's the last non-region mapped ring we have, so it also
kills a bunch of now unused memmap.c helpers.
Signed-off-by: Pavel Begunkov
---
 io_uring/kbuf.c   | 170 ++++++++++++++--------------------------
 io_uring/kbuf.h   |  18 ++--
 io_uring/memmap.c | 118 +++++---------------------
 io_uring/memmap.h |   7 --
 4 files changed, 73 insertions(+), 240 deletions(-)

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 644f61445ec9..2dfb9f9419a0 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -351,17 +351,7 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx,
 
 	if (bl->flags & IOBL_BUF_RING) {
 		i = bl->buf_ring->tail - bl->head;
-		if (bl->buf_nr_pages) {
-			int j;
-
-			if (!(bl->flags & IOBL_MMAP)) {
-				for (j = 0; j < bl->buf_nr_pages; j++)
-					unpin_user_page(bl->buf_pages[j]);
-			}
-			io_pages_unmap(bl->buf_ring, &bl->buf_pages,
-				       &bl->buf_nr_pages, bl->flags & IOBL_MMAP);
-			bl->flags &= ~IOBL_MMAP;
-		}
+		io_free_region(ctx, &bl->region);
 		/* make sure it's seen as empty */
 		INIT_LIST_HEAD(&bl->buf_list);
 		bl->flags &= ~IOBL_BUF_RING;
@@ -614,75 +604,14 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
 	return IOU_OK;
 }
 
-static int io_pin_pbuf_ring(struct io_uring_buf_reg *reg,
-			    struct io_buffer_list *bl)
-{
-	struct io_uring_buf_ring *br = NULL;
-	struct page **pages;
-	int nr_pages, ret;
-
-	pages = io_pin_pages(reg->ring_addr,
-			     flex_array_size(br, bufs, reg->ring_entries),
-			     &nr_pages);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	br = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
-	if (!br) {
-		ret = -ENOMEM;
-		goto error_unpin;
-	}
-
-#ifdef SHM_COLOUR
-	/*
-	 * On platforms that have specific aliasing requirements, SHM_COLOUR
-	 * is set and we must guarantee that the kernel and user side align
-	 * nicely. We cannot do that if IOU_PBUF_RING_MMAP isn't set and
-	 * the application mmap's the provided ring buffer. Fail the request
-	 * if we, by chance, don't end up with aligned addresses. The app
-	 * should use IOU_PBUF_RING_MMAP instead, and liburing will handle
-	 * this transparently.
-	 */
-	if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) {
-		ret = -EINVAL;
-		goto error_unpin;
-	}
-#endif
-	bl->buf_pages = pages;
-	bl->buf_nr_pages = nr_pages;
-	bl->buf_ring = br;
-	bl->flags |= IOBL_BUF_RING;
-	bl->flags &= ~IOBL_MMAP;
-	return 0;
-error_unpin:
-	unpin_user_pages(pages, nr_pages);
-	kvfree(pages);
-	vunmap(br);
-	return ret;
-}
-
-static int io_alloc_pbuf_ring(struct io_ring_ctx *ctx,
-			      struct io_uring_buf_reg *reg,
-			      struct io_buffer_list *bl)
-{
-	size_t ring_size;
-
-	ring_size = reg->ring_entries * sizeof(struct io_uring_buf_ring);
-
-	bl->buf_ring = io_pages_map(&bl->buf_pages, &bl->buf_nr_pages, ring_size);
-	if (IS_ERR(bl->buf_ring)) {
-		bl->buf_ring = NULL;
-		return -ENOMEM;
-	}
-
-	bl->flags |= (IOBL_BUF_RING | IOBL_MMAP);
-	return 0;
-}
-
 int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 {
 	struct io_uring_buf_reg reg;
 	struct io_buffer_list *bl, *free_bl = NULL;
+	struct io_uring_region_desc rd;
+	struct io_uring_buf_ring *br;
+	unsigned long mmap_offset;
+	unsigned long ring_size;
 	int ret;
 
 	lockdep_assert_held(&ctx->uring_lock);
@@ -694,19 +623,8 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 		return -EINVAL;
 	if (reg.flags & ~(IOU_PBUF_RING_MMAP | IOU_PBUF_RING_INC))
 		return -EINVAL;
-	if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
-		if (!reg.ring_addr)
-			return -EFAULT;
-		if (reg.ring_addr & ~PAGE_MASK)
-			return -EINVAL;
-	} else {
-		if (reg.ring_addr)
-			return -EINVAL;
-	}
-
 	if (!is_power_of_2(reg.ring_entries))
 		return -EINVAL;
-
 	/* cannot disambiguate full vs empty due to head/tail size */
 	if (reg.ring_entries >= 65536)
 		return -EINVAL;
@@ -722,21 +640,47 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
 			return -ENOMEM;
 	}
 
-	if (!(reg.flags & IOU_PBUF_RING_MMAP))
-		ret = io_pin_pbuf_ring(&reg, bl);
-	else
-		ret = io_alloc_pbuf_ring(ctx, &reg, bl);
+	mmap_offset = reg.bgid << IORING_OFF_PBUF_SHIFT;
+	ring_size = flex_array_size(br, bufs, reg.ring_entries);
 
-	if (!ret) {
-		bl->nr_entries = reg.ring_entries;
-		bl->mask = reg.ring_entries - 1;
-		if (reg.flags & IOU_PBUF_RING_INC)
-			bl->flags |= IOBL_INC;
+	memset(&rd, 0, sizeof(rd));
+	rd.size = PAGE_ALIGN(ring_size);
+	if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
+		rd.user_addr = reg.ring_addr;
+		rd.flags |= IORING_MEM_REGION_TYPE_USER;
+	}
+	ret = io_create_region_mmap_safe(ctx, &bl->region, &rd, mmap_offset);
+	if (ret)
+		goto fail;
+	br = io_region_get_ptr(&bl->region);
 
-		io_buffer_add_list(ctx, bl, reg.bgid);
-		return 0;
+#ifdef SHM_COLOUR
+	/*
+	 * On platforms that have specific aliasing requirements, SHM_COLOUR
+	 * is set and we must guarantee that the kernel and user side align
+	 * nicely. We cannot do that if IOU_PBUF_RING_MMAP isn't set and
+	 * the application mmap's the provided ring buffer. Fail the request
+	 * if we, by chance, don't end up with aligned addresses. The app
+	 * should use IOU_PBUF_RING_MMAP instead, and liburing will handle
+	 * this transparently.
+	 */
+	if (!(reg.flags & IOU_PBUF_RING_MMAP) &&
+	    ((reg.ring_addr | (unsigned long)br) & (SHM_COLOUR - 1))) {
+		ret = -EINVAL;
+		goto fail;
 	}
+#endif
+	bl->nr_entries = reg.ring_entries;
+	bl->mask = reg.ring_entries - 1;
+	bl->flags |= IOBL_BUF_RING;
+	bl->buf_ring = br;
+	if (reg.flags & IOU_PBUF_RING_INC)
+		bl->flags |= IOBL_INC;
+	io_buffer_add_list(ctx, bl, reg.bgid);
+	return 0;
+fail:
+	io_free_region(ctx, &bl->region);
 	kfree(free_bl);
 	return ret;
 }
@@ -794,32 +738,18 @@ int io_register_pbuf_status(struct io_ring_ctx *ctx, void __user *arg)
 	return 0;
 }
 
-struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
-				      unsigned long bgid)
-{
-	struct io_buffer_list *bl;
-
-	bl = xa_load(&ctx->io_bl_xa, bgid);
-	/* must be a mmap'able buffer ring and have pages */
-	if (bl && bl->flags & IOBL_MMAP)
-		return bl;
-
-	return ERR_PTR(-EINVAL);
-}
-
-int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma)
+struct io_mapped_region *io_pbuf_get_region(struct io_ring_ctx *ctx,
+					    unsigned int bgid)
 {
-	struct io_ring_ctx *ctx = file->private_data;
-	loff_t pgoff = vma->vm_pgoff << PAGE_SHIFT;
 	struct io_buffer_list *bl;
-	int bgid;
 
 	lockdep_assert_held(&ctx->mmap_lock);
 
-	bgid = (pgoff & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
-	bl = io_pbuf_get_bl(ctx, bgid);
-	if (IS_ERR(bl))
-		return PTR_ERR(bl);
+	bl = xa_load(&ctx->io_bl_xa, bgid);
+	if (!bl || !(bl->flags & IOBL_BUF_RING))
+		return NULL;
+	if (WARN_ON_ONCE(!io_region_is_set(&bl->region)))
+		return NULL;
 
-	return io_uring_mmap_pages(ctx, vma, bl->buf_pages, bl->buf_nr_pages);
+	return &bl->region;
 }
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index dff7444026a6..bd80c44c5af1 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -3,15 +3,13 @@
 #define IOU_KBUF_H
 
 #include
+#include
 
 enum {
 	/* ring mapped provided buffers */
 	IOBL_BUF_RING	= 1,
-	/* ring mapped provided buffers, but mmap'ed by application */
-	IOBL_MMAP	= 2,
 	/* buffers are consumed incrementally rather than always fully */
-	IOBL_INC	= 4,
-
+	IOBL_INC	= 2,
 };
 
 struct io_buffer_list {
@@ -21,10 +19,7 @@ struct io_buffer_list {
 	 */
 	union {
 		struct list_head buf_list;
-		struct {
-			struct page **buf_pages;
-			struct io_uring_buf_ring *buf_ring;
-		};
+		struct io_uring_buf_ring *buf_ring;
 	};
 	__u16 bgid;
@@ -35,6 +30,8 @@ struct io_buffer_list {
 	__u16 mask;
 
 	__u16 flags;
+
+	struct io_mapped_region region;
 };
 
 struct io_buffer {
@@ -81,9 +78,8 @@ void __io_put_kbuf(struct io_kiocb *req, int len, unsigned issue_flags);
 
 bool io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
 
-struct io_buffer_list *io_pbuf_get_bl(struct io_ring_ctx *ctx,
-				      unsigned long bgid);
-int io_pbuf_mmap(struct file *file, struct vm_area_struct *vma);
+struct io_mapped_region *io_pbuf_get_region(struct io_ring_ctx *ctx,
+					    unsigned int bgid);
 
 static inline bool io_kbuf_recycle_ring(struct io_kiocb *req)
 {
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 73b73f4ea1bd..6d8a98bd9cac 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -36,90 +36,6 @@ static void *io_mem_alloc_compound(struct page **pages, int nr_pages,
 	return page_address(page);
 }
 
-static void *io_mem_alloc_single(struct page **pages, int nr_pages, size_t size,
-				 gfp_t gfp)
-{
-	void *ret;
-	int i;
-
-	for (i = 0; i < nr_pages; i++) {
-		pages[i] = alloc_page(gfp);
-		if (!pages[i])
-			goto err;
-	}
-
-	ret = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
-	if (ret)
-		return ret;
-err:
-	while (i--)
-		put_page(pages[i]);
-	return ERR_PTR(-ENOMEM);
-}
-
-void *io_pages_map(struct page ***out_pages, unsigned short *npages,
-		   size_t size)
-{
-	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN;
-	struct page **pages;
-	int nr_pages;
-	void *ret;
-
-	nr_pages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	pages = kvmalloc_array(nr_pages, sizeof(struct page *), gfp);
-	if (!pages)
-		return ERR_PTR(-ENOMEM);
-
-	ret = io_mem_alloc_compound(pages, nr_pages, size, gfp);
-	if (!IS_ERR(ret))
-		goto done;
-	if (nr_pages == 1)
-		goto fail;
-
-	ret = io_mem_alloc_single(pages, nr_pages, size, gfp);
-	if (!IS_ERR(ret)) {
-done:
-		*out_pages = pages;
-		*npages = nr_pages;
-		return ret;
-	}
-fail:
-	kvfree(pages);
-	*out_pages = NULL;
-	*npages = 0;
-	return ret;
-}
-
-void io_pages_unmap(void *ptr, struct page ***pages, unsigned short *npages,
-		    bool put_pages)
-{
-	bool do_vunmap = false;
-
-	if (!ptr)
-		return;
-
-	if (put_pages && *npages) {
-		struct page **to_free = *pages;
-		int i;
-
-		/*
-		 * Only did vmap for the non-compound multiple page case.
-		 * For the compound page, we just need to put the head.
-		 */
-		if (PageCompound(to_free[0]))
-			*npages = 1;
-		else if (*npages > 1)
-			do_vunmap = true;
-		for (i = 0; i < *npages; i++)
-			put_page(to_free[i]);
-	}
-	if (do_vunmap)
-		vunmap(ptr);
-	kvfree(*pages);
-	*pages = NULL;
-	*npages = 0;
-}
-
 struct page **io_pin_pages(unsigned long uaddr, unsigned long len, int *npages)
 {
 	unsigned long start, end, nr_pages;
@@ -374,16 +290,14 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 			return ERR_PTR(-EFAULT);
 		return ctx->sq_sqes;
 	case IORING_OFF_PBUF_RING: {
-		struct io_buffer_list *bl;
+		struct io_mapped_region *region;
 		unsigned int bgid;
-		void *ptr;
 
 		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
-		bl = io_pbuf_get_bl(ctx, bgid);
-		if (IS_ERR(bl))
-			return bl;
-		ptr = bl->buf_ring;
-		return ptr;
+		region = io_pbuf_get_region(ctx, bgid);
+		if (!region)
+			return ERR_PTR(-EINVAL);
+		return io_region_validate_mmap(ctx, region);
 	}
 	case IORING_MAP_OFF_PARAM_REGION:
 		return io_region_validate_mmap(ctx, &ctx->param_region);
@@ -392,15 +306,6 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 	return ERR_PTR(-EINVAL);
 }
 
-int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
-			struct page **pages, int npages)
-{
-	unsigned long nr_pages = npages;
-
-	vm_flags_set(vma, VM_DONTEXPAND);
-	return vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);
-}
-
 #ifdef CONFIG_MMU
 
 static int io_region_mmap(struct io_ring_ctx *ctx,
@@ -435,8 +340,17 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 		return io_region_mmap(ctx, &ctx->ring_region, vma, page_limit);
 	case IORING_OFF_SQES:
 		return io_region_mmap(ctx, &ctx->sq_region, vma, UINT_MAX);
-	case IORING_OFF_PBUF_RING:
-		return io_pbuf_mmap(file, vma);
+	case IORING_OFF_PBUF_RING: {
+		struct io_mapped_region *region;
+		unsigned int bgid;
+
+		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
+		region = io_pbuf_get_region(ctx, bgid);
+		if (!region)
+			return -EINVAL;
+
+		return io_region_mmap(ctx, region, vma, UINT_MAX);
+	}
 	case IORING_MAP_OFF_PARAM_REGION:
 		return io_region_mmap(ctx, &ctx->param_region, vma, UINT_MAX);
 	}
diff --git a/io_uring/memmap.h b/io_uring/memmap.h
index 7395996eb353..c898dcba2b4e 100644
--- a/io_uring/memmap.h
+++ b/io_uring/memmap.h
@@ -4,13 +4,6 @@
 #define IORING_MAP_OFF_PARAM_REGION	0x20000000ULL
 
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages);
-int io_uring_mmap_pages(struct io_ring_ctx *ctx, struct vm_area_struct *vma,
-			struct page **pages, int npages);
-
-void *io_pages_map(struct page ***out_pages, unsigned short *npages,
-		   size_t size);
-void io_pages_unmap(void *ptr, struct page ***pages, unsigned short *npages,
-		    bool put_pages);
 
 #ifndef CONFIG_MMU
 unsigned int io_uring_nommu_mmap_capabilities(struct file *file);

From patchwork Fri Nov 29 13:34:39 2024
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13888705
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com
Subject: [PATCH v3 18/18] io_uring/memmap: unify io_uring mmap'ing code
Date: Fri, 29 Nov 2024 13:34:39 +0000
X-Mailer: git-send-email 2.47.1

All mapped memory is now backed by regions, so we can unify and clean up
io_region_validate_mmap() and io_uring_mmap(). Extract a function that
looks up a region; the rest of the handling is generic and only needs
the region. One piece of ring-specific code remains, the mmap size
truncation quirk for IORING_OFF_[S,C]Q_RING, which is left as is.
Signed-off-by: Pavel Begunkov
---
 io_uring/kbuf.c   |  3 --
 io_uring/memmap.c | 81 ++++++++++++++++++-----------------------
 2 files changed, 31 insertions(+), 53 deletions(-)

diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 2dfb9f9419a0..e91260a6156b 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -748,8 +748,5 @@ struct io_mapped_region *io_pbuf_get_region(struct io_ring_ctx *ctx,
 	bl = xa_load(&ctx->io_bl_xa, bgid);
 	if (!bl || !(bl->flags & IOBL_BUF_RING))
 		return NULL;
-	if (WARN_ON_ONCE(!io_region_is_set(&bl->region)))
-		return NULL;
-
 	return &bl->region;
 }
diff --git a/io_uring/memmap.c b/io_uring/memmap.c
index 6d8a98bd9cac..dda846190fbd 100644
--- a/io_uring/memmap.c
+++ b/io_uring/memmap.c
@@ -254,6 +254,27 @@ int io_create_region_mmap_safe(struct io_ring_ctx *ctx, struct io_mapped_region
 	return 0;
 }
 
+static struct io_mapped_region *io_mmap_get_region(struct io_ring_ctx *ctx,
+						   loff_t pgoff)
+{
+	loff_t offset = pgoff << PAGE_SHIFT;
+	unsigned int bgid;
+
+	switch (offset & IORING_OFF_MMAP_MASK) {
+	case IORING_OFF_SQ_RING:
+	case IORING_OFF_CQ_RING:
+		return &ctx->ring_region;
+	case IORING_OFF_SQES:
+		return &ctx->sq_region;
+	case IORING_OFF_PBUF_RING:
+		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
+		return io_pbuf_get_region(ctx, bgid);
+	case IORING_MAP_OFF_PARAM_REGION:
+		return &ctx->param_region;
+	}
+	return NULL;
+}
+
 static void *io_region_validate_mmap(struct io_ring_ctx *ctx,
 				     struct io_mapped_region *mr)
 {
@@ -271,39 +292,12 @@ static void *io_uring_validate_mmap_request(struct file *file, loff_t pgoff,
 					    size_t sz)
 {
 	struct io_ring_ctx *ctx = file->private_data;
-	loff_t offset = pgoff << PAGE_SHIFT;
+	struct io_mapped_region *region;
 
-	switch ((pgoff << PAGE_SHIFT) & IORING_OFF_MMAP_MASK) {
-	case IORING_OFF_SQ_RING:
-	case IORING_OFF_CQ_RING:
-		/* Don't allow mmap if the ring was setup without it */
-		if (ctx->flags & IORING_SETUP_NO_MMAP)
-			return ERR_PTR(-EINVAL);
-		if (!ctx->rings)
-			return ERR_PTR(-EFAULT);
-		return ctx->rings;
-	case IORING_OFF_SQES:
-		/* Don't allow mmap if the ring was setup without it */
-		if (ctx->flags & IORING_SETUP_NO_MMAP)
-			return ERR_PTR(-EINVAL);
-		if (!ctx->sq_sqes)
-			return ERR_PTR(-EFAULT);
-		return ctx->sq_sqes;
-	case IORING_OFF_PBUF_RING: {
-		struct io_mapped_region *region;
-		unsigned int bgid;
-
-		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
-		region = io_pbuf_get_region(ctx, bgid);
-		if (!region)
-			return ERR_PTR(-EINVAL);
-		return io_region_validate_mmap(ctx, region);
-	}
-	case IORING_MAP_OFF_PARAM_REGION:
-		return io_region_validate_mmap(ctx, &ctx->param_region);
-	}
-
-	return ERR_PTR(-EINVAL);
+	region = io_mmap_get_region(ctx, pgoff);
+	if (!region)
+		return ERR_PTR(-EINVAL);
+	return io_region_validate_mmap(ctx, region);
 }
 
 #ifdef CONFIG_MMU
@@ -324,7 +318,8 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	struct io_ring_ctx *ctx = file->private_data;
 	size_t sz = vma->vm_end - vma->vm_start;
 	long offset = vma->vm_pgoff << PAGE_SHIFT;
-	unsigned int page_limit;
+	unsigned int page_limit = UINT_MAX;
+	struct io_mapped_region *region;
 	void *ptr;
 
 	guard(mutex)(&ctx->mmap_lock);
@@ -337,25 +332,11 @@ __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
 	case IORING_OFF_SQ_RING:
 	case IORING_OFF_CQ_RING:
 		page_limit = (sz + PAGE_SIZE - 1) >> PAGE_SHIFT;
-		return io_region_mmap(ctx, &ctx->ring_region, vma, page_limit);
-	case IORING_OFF_SQES:
-		return io_region_mmap(ctx, &ctx->sq_region, vma, UINT_MAX);
-	case IORING_OFF_PBUF_RING: {
-		struct io_mapped_region *region;
-		unsigned int bgid;
-
-		bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
-		region = io_pbuf_get_region(ctx, bgid);
-		if (!region)
-			return -EINVAL;
-
-		return io_region_mmap(ctx, region, vma, UINT_MAX);
-	}
-	case IORING_MAP_OFF_PARAM_REGION:
-		return io_region_mmap(ctx, &ctx->param_region, vma, UINT_MAX);
+		break;
 	}
 
-	return -EINVAL;
+	region = io_mmap_get_region(ctx, vma->vm_pgoff);
+	return io_region_mmap(ctx, region, vma, page_limit);
 }
 
 unsigned long io_uring_get_unmapped_area(struct file *filp, unsigned long addr,