From patchwork Wed Jan 15 13:35:04 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940444
Date: Wed, 15 Jan 2025 13:35:04 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-1-375099ae017a@google.com>
Subject: [PATCH v12 1/8] mm: rust: add abstraction for struct mm_struct
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

These abstractions allow you to reference a `struct mm_struct` using
both mmgrab and mmget refcounts. This is done using two Rust types:

* Mm - represents an mm_struct where you don't know anything about the
  value of mm_users.
* MmWithUser - represents an mm_struct where you know at compile time
  that mm_users is non-zero.

This allows us to encode in the type system whether a method requires
that mm_users is non-zero or not. For instance, you can always call
`mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
non-zero.

The struct is called Mm to keep consistency with the C side.

The ability to obtain `current->mm` is added later in this series.

Acked-by: Lorenzo Stoakes (for mm bits)
Signed-off-by: Alice Ryhl
---
 rust/helpers/helpers.c |   1 +
 rust/helpers/mm.c      |  39 +++++++++
 rust/kernel/lib.rs     |   1 +
 rust/kernel/mm.rs      | 209 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 250 insertions(+)

diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index dcf827a61b52..9d748ec845b3 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -16,6 +16,7 @@
 #include "fs.c"
 #include "jump_label.c"
 #include "kunit.c"
+#include "mm.c"
 #include "mutex.c"
 #include "page.c"
 #include "pid_namespace.c"
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
new file mode 100644
index 000000000000..7201747a5d31
--- /dev/null
+++ b/rust/helpers/mm.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+
+void rust_helper_mmgrab(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+void rust_helper_mmdrop(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
+void rust_helper_mmget(struct mm_struct *mm)
+{
+	mmget(mm);
+}
+
+bool rust_helper_mmget_not_zero(struct mm_struct *mm)
+{
+	return mmget_not_zero(mm);
+}
+
+void rust_helper_mmap_read_lock(struct mm_struct *mm)
+{
+	mmap_read_lock(mm);
+}
+
+bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
+{
+	return mmap_read_trylock(mm);
+}
+
+void rust_helper_mmap_read_unlock(struct mm_struct *mm)
+{
+	mmap_read_unlock(mm);
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e1065a7551a3..6555e0847192 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -46,6 +46,7 @@
 pub mod kunit;
 pub mod list;
 pub mod miscdevice;
+pub mod mm;
 #[cfg(CONFIG_NET)]
 pub mod net;
 pub mod page;
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
new file mode 100644
index 000000000000..2fb5f440af60
--- /dev/null
+++ b/rust/kernel/mm.rs
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Memory management.
+//!
+//! This module deals with managing the address space of userspace processes. Each process has an
+//! instance of [`Mm`], which keeps track of multiple VMAs (virtual memory areas). Each VMA
+//! corresponds to a region of memory that the userspace process can access, and the VMA lets you
+//! control what happens when userspace reads or writes to that region of memory.
+//!
+//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
+
+use crate::{
+    bindings,
+    types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
+};
+use core::{ops::Deref, ptr::NonNull};
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This represents the address space of a userspace process, so each process has one `Mm`
+/// instance. It may hold many VMAs internally.
+///
+/// There is a counter called `mm_users` that counts the users of the address space; this includes
+/// the userspace process itself, but can also include kernel threads accessing the address space.
+/// Once `mm_users` reaches zero, this indicates that the address space can be destroyed. To access
+/// the address space, you must prevent `mm_users` from reaching zero while you are accessing it.
+/// The [`MmWithUser`] type represents an address space where this is guaranteed, and you can
+/// create one using [`mmget_not_zero`].
+///
+/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmgrab`.
+///
+/// [`mmget_not_zero`]: Mm::mmget_not_zero
+#[repr(transparent)]
+pub struct Mm {
+    mm: Opaque<bindings::mm_struct>,
+}
+
+// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
+unsafe impl Send for Mm {}
+// SAFETY: All methods on `Mm` can be called in parallel from several threads.
+unsafe impl Sync for Mm {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for Mm {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmgrab(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
+    }
+}
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
+/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
+/// refcount. It can be used to access the associated address space.
+///
+/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUser {
+    mm: Mm,
+}
+
+// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUser {}
+// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUser {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUser {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmget(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmput(obj.cast().as_ptr()) };
+    }
+}
+
+// Make all `Mm` methods available on `MmWithUser`.
+impl Deref for MmWithUser {
+    type Target = Mm;
+
+    #[inline]
+    fn deref(&self) -> &Mm {
+        &self.mm
+    }
+}
+
+// These methods are safe to call even if `mm_users` is zero.
+impl Mm {
+    /// Returns a raw pointer to the inner `mm_struct`.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::mm_struct {
+        self.mm.get()
+    }
+
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
+    /// during the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
+        // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
+        // repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Calls `mmget_not_zero` and returns a handle if it succeeds.
+    #[inline]
+    pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
+
+        if success {
+            // SAFETY: We just created an `mmget` refcount.
+            Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
+        } else {
+            None
+        }
+    }
+}
+
+// These methods require `mm_users` to be non-zero.
+impl MmWithUser {
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
+    /// non-zero for the duration of the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
+        // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
+        // to repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmap_read_lock(self.as_raw()) };
+
+        // INVARIANT: We just acquired the read lock.
+        MmapReadGuard {
+            mm: self,
+            _nts: NotThreadSafe,
+        }
+    }
+
+    /// Try to lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
+
+        if success {
+            // INVARIANT: We just acquired the read lock.
+            Some(MmapReadGuard {
+                mm: self,
+                _nts: NotThreadSafe,
+            })
+        } else {
+            None
+        }
+    }
+}
+
+/// A guard for the mmap read lock.
+///
+/// # Invariants
+///
+/// This `MmapReadGuard` guard owns the mmap read lock.
+pub struct MmapReadGuard<'a> {
+    mm: &'a MmWithUser,
+    // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
+    _nts: NotThreadSafe,
+}
+
+impl Drop for MmapReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
+    }
+}

From patchwork Wed Jan 15 13:35:05 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940445
Date: Wed, 15 Jan 2025 13:35:05 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-2-375099ae017a@google.com>
Subject: [PATCH v12 2/8] mm: rust: add vm_area_struct methods that require
 read access
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

This adds a type called VmAreaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold
either the mmap read lock or the vma read lock (or stronger).

Additionally, a vma_lookup method is added to the mmap read guard, which
enables you to obtain a &VmAreaRef in safe Rust code.

This patch only provides a way to lock the mmap read lock, but a
follow-up patch also provides a way to just lock the vma read lock.
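
As a rough userspace illustration (not kernel code) of why vma_lookup can
hand out a reference in safe Rust, the sketch below ties the lifetime of
the looked-up area to a read guard. All names in it (`AddressSpace`,
`ReadGuard`, `Area`) are hypothetical stand-ins, and `std::sync::RwLock`
is only assumed here as a substitute for the mmap lock.

```rust
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical stand-in for a VMA: the half-open address range [start, end).
#[derive(Debug, PartialEq)]
struct Area {
    start: usize,
    end: usize,
}

/// Hypothetical stand-in for an address space guarded by a reader/writer lock.
struct AddressSpace {
    areas: RwLock<Vec<Area>>,
}

/// Hypothetical stand-in for the mmap read guard; dropping it releases the lock.
struct ReadGuard<'a> {
    areas: RwLockReadGuard<'a, Vec<Area>>,
}

impl AddressSpace {
    /// Take the read lock and wrap it in a guard.
    fn read_lock(&self) -> ReadGuard<'_> {
        ReadGuard {
            areas: self.areas.read().unwrap(),
        }
    }
}

impl ReadGuard<'_> {
    /// Find the area containing `addr`, if any.
    ///
    /// The returned reference borrows from `self`, so the borrow checker
    /// rejects any use of it after the guard has been dropped.
    fn lookup(&self, addr: usize) -> Option<&Area> {
        self.areas.iter().find(|a| a.start <= addr && addr < a.end)
    }
}
```

In this model, calling `lookup` on a guard and then dropping the guard
while the returned `&Area` is still in use is a compile error, which
mirrors the reasoning in the SAFETY comment of vma_lookup.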
Acked-by: Lorenzo Stoakes (for mm bits) Reviewed-by: Jann Horn Signed-off-by: Alice Ryhl --- rust/helpers/mm.c | 6 ++ rust/kernel/mm.rs | 21 +++++ rust/kernel/mm/virt.rs | 215 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 242 insertions(+) diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c index 7201747a5d31..7b72eb065a3e 100644 --- a/rust/helpers/mm.c +++ b/rust/helpers/mm.c @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm) { mmap_read_unlock(mm); } + +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm, + unsigned long addr) +{ + return vma_lookup(mm, addr); +} diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs index 2fb5f440af60..ee1a062ec7d7 100644 --- a/rust/kernel/mm.rs +++ b/rust/kernel/mm.rs @@ -17,6 +17,8 @@ }; use core::{ops::Deref, ptr::NonNull}; +pub mod virt; + /// A wrapper for the kernel's `struct mm_struct`. /// /// This represents the address space of a userspace process, so each process has one `Mm` @@ -200,6 +202,25 @@ pub struct MmapReadGuard<'a> { _nts: NotThreadSafe, } +impl<'a> MmapReadGuard<'a> { + /// Look up a vma at the given address. + #[inline] + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> { + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay + // for `vma_addr`. + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) }; + + if vma.is_null() { + None + } else { + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore, + // the returned area will borrow from this read lock guard, so it can only be used + // while the mmap read lock is still held. 
+ unsafe { Some(virt::VmAreaRef::from_raw(vma)) } + } + } +} + impl Drop for MmapReadGuard<'_> { #[inline] fn drop(&mut self) { diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs new file mode 100644 index 000000000000..2c7de0460e0a --- /dev/null +++ b/rust/kernel/mm/virt.rs @@ -0,0 +1,215 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Virtual memory. +//! +//! This module deals with managing a single VMA in the address space of a userspace process. Each +//! VMA corresponds to a region of memory that the userspace process can access, and the VMA lets +//! you control what happens when userspace reads or writes to that region of memory. +//! +//! The module has several different Rust types that all correspond to the C type called +//! `vm_area_struct`. The different structs represent what kind of access you have to the VMA, e.g. +//! [`VmAreaRef`] is used when you hold the mmap or vma read lock. Using the appropriate struct +//! ensures that you can't, for example, accidentally call a function that requires holding the +//! write lock when you only hold the read lock. + +use crate::{bindings, mm::MmWithUser, types::Opaque}; + +/// A wrapper for the kernel's `struct vm_area_struct` with read access. +/// +/// It represents an area of virtual memory. +/// +/// # Invariants +/// +/// The caller must hold the mmap read lock or the vma read lock. +#[repr(transparent)] +pub struct VmAreaRef { + vma: Opaque, +} + +// Methods you can call when holding the mmap or vma read lock (or stronger). They must be usable +// no matter what the vma flags are. +impl VmAreaRef { + /// Access a virtual memory area given a raw pointer. + /// + /// # Safety + /// + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap or vma + /// read lock (or stronger) is held for at least the duration of 'a. 
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Returns a raw pointer to this area.
+    #[inline]
+    pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
+        self.vma.get()
+    }
+
+    /// Access the underlying `mm_struct`.
+    #[inline]
+    pub fn mm(&self) -> &MmWithUser {
+        // SAFETY: By the type invariants, this `vm_area_struct` is valid and we hold the mmap/vma
+        // read lock or stronger. This implies that the underlying mm has a non-zero value of
+        // `mm_users`.
+        unsafe { MmWithUser::from_raw((*self.as_ptr()).vm_mm) }
+    }
+
+    /// Returns the flags associated with the virtual memory area.
+    ///
+    /// The possible flags are a combination of the constants in [`flags`].
+    #[inline]
+    pub fn flags(&self) -> vm_flags_t {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
+    }
+
+    /// Returns the (inclusive) start address of the virtual memory area.
+    #[inline]
+    pub fn start(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
+    }
+
+    /// Returns the (exclusive) end address of the virtual memory area.
+    #[inline]
+    pub fn end(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
+    }
+
+    /// Zap pages in the given page range.
+    ///
+    /// This clears page table mappings for the range at the leaf level, leaving all other page
+    /// tables intact, and freeing any memory referenced by the VMA in this range. That is,
+    /// anonymous memory is completely freed, file-backed memory has its reference count on page
+    /// cache folios dropped, and any dirty data will still be written back to disk as usual.
+    ///
+    /// It may seem odd that we clear at the leaf level, this is however a product of the page
+    /// table structure used to map physical memory into a virtual address space - each virtual
+    /// address actually consists of a bitmap of array indices into page tables, which form a
+    /// hierarchical page table level structure.
+    ///
+    /// As a result, each page table level maps a multiple of page table levels below, and thus
+    /// spans ever increasing ranges of pages. At the leaf or PTE level, we map the actual physical
+    /// memory.
+    ///
+    /// It is here where a zap operates, as it is the only place we can be certain of clearing
+    /// without impacting any other virtual mappings. It is an implementation detail as to whether
+    /// the kernel goes further in freeing unused page tables, but for the purposes of this
+    /// operation we must only assume that the leaf level is cleared.
+    #[inline]
+    pub fn zap_page_range_single(&self, address: usize, size: usize) {
+        let (end, did_overflow) = address.overflowing_add(size);
+        if did_overflow || address < self.start() || self.end() < end {
+            // TODO: call WARN_ONCE once Rust version of it is added
+            return;
+        }
+
+        // SAFETY: By the type invariants, the caller has read access to this VMA, which is
+        // sufficient for this method call. This method has no requirements on the vma flags. The
+        // address range is checked to be within the vma.
+        unsafe {
+            bindings::zap_page_range_single(
+                self.as_ptr(),
+                address as _,
+                size as _,
+                core::ptr::null_mut(),
+            )
+        };
+    }
+}
+
+/// The integer type used for vma flags.
+#[doc(inline)]
+pub use bindings::vm_flags_t;
+
+/// All possible flags for [`VmAreaRef`].
+pub mod flags {
+    use super::vm_flags_t;
+    use crate::bindings;
+
+    /// No flags are set.
+    pub const NONE: vm_flags_t = bindings::VM_NONE as _;
+
+    /// Mapping allows reads.
+    pub const READ: vm_flags_t = bindings::VM_READ as _;
+
+    /// Mapping allows writes.
+    pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
+
+    /// Mapping allows execution.
+    pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
+
+    /// Mapping is shared.
+    pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
+
+    /// Mapping may be updated to allow reads.
+    pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
+
+    /// Mapping may be updated to allow writes.
+    pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
+
+    /// Mapping may be updated to allow execution.
+    pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
+
+    /// Mapping may be updated to be shared.
+    pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
+
+    /// Page-ranges managed without `struct page`, just pure PFN.
+    pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
+
+    /// Memory mapped I/O or similar.
+    pub const IO: vm_flags_t = bindings::VM_IO as _;
+
+    /// Do not copy this vma on fork.
+    pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
+
+    /// Cannot expand with mremap().
+    pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
+
+    /// Lock the pages covered when they are faulted in.
+    pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
+
+    /// Is a VM accounted object.
+    pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
+
+    /// Should the VM suppress accounting.
+    pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
+
+    /// Huge TLB Page VM.
+    pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
+
+    /// Synchronous page faults. (DAX-specific)
+    pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
+
+    /// Architecture-specific flag.
+    pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
+
+    /// Wipe VMA contents in child on fork.
+    pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
+
+    /// Do not include in the core dump.
+    pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
+
+    /// Not soft dirty clean area.
+    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
+
+    /// Can contain `struct page` and pure PFN pages.
+    pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
+
+    /// MADV_HUGEPAGE marked this vma.
+    pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
+
+    /// MADV_NOHUGEPAGE marked this vma.
+    pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
+
+    /// KSM may merge identical pages.
+    pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
+}

From patchwork Wed Jan 15 13:35:06 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940446
Date: Wed, 15 Jan 2025 13:35:06 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-3-375099ae017a@google.com>
Subject: [PATCH v12 3/8] mm: rust: add vm_insert_page
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
flag, so we introduce a new type to keep track of such vmas.

The approach used in this patch assumes that we will not need to encode
many flag combinations in the type. I don't think we need to encode more
than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
becomes necessary, using generic parameters in a single type would scale
better as the number of flags increases.

Acked-by: Lorenzo Stoakes (for mm bits)
Signed-off-by: Alice Ryhl
---
 rust/kernel/mm/virt.rs | 79 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index 2c7de0460e0a..ab89a526d3e4 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -14,7 +14,15 @@
 //! ensures that you can't, for example, accidentally call a function that requires holding the
 //! write lock when you only hold the read lock.
 
-use crate::{bindings, mm::MmWithUser, types::Opaque};
+use crate::{
+    bindings,
+    error::{to_result, Result},
+    mm::MmWithUser,
+    page::Page,
+    types::Opaque,
+};
+
+use core::ops::Deref;
 
 /// A wrapper for the kernel's `struct vm_area_struct` with read access.
 ///
@@ -124,6 +132,75 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
             )
         };
     }
+
+    /// If the [`VM_MIXEDMAP`] flag is set, returns a [`VmAreaMixedMap`] to this VMA, otherwise
+    /// returns `None`.
+    ///
+    /// This can be used to access methods that require [`VM_MIXEDMAP`] to be set.
+    ///
+    /// [`VM_MIXEDMAP`]: flags::MIXEDMAP
+    #[inline]
+    pub fn as_mixedmap_vma(&self) -> Option<&VmAreaMixedMap> {
+        if self.flags() & flags::MIXEDMAP != 0 {
+            // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
+            // satisfied by the type invariants of `VmAreaRef`.
+            Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
+        } else {
+            None
+        }
+    }
+}
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access and [`VM_MIXEDMAP`] set.
+///
+/// It represents an area of virtual memory.
+///
+/// This struct is identical to [`VmAreaRef`] except that it must only be used when the
+/// [`VM_MIXEDMAP`] flag is set on the vma.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
+/// set.
+///
+/// [`VM_MIXEDMAP`]: flags::MIXEDMAP
+#[repr(transparent)]
+pub struct VmAreaMixedMap {
+    vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
+impl Deref for VmAreaMixedMap {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        &self.vma
+    }
+}
+
+impl VmAreaMixedMap {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
+    /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Maps a single page at the given address within the virtual memory area.
+    ///
+    /// This operation does not take ownership of the page.
+    #[inline]
+    pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
+        // SAFETY: By the type invariant of `Self` caller has read access and has verified that
+        // `VM_MIXEDMAP` is set. By invariant on `Page` the page has order 0.
+        to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address as _, page.as_ptr()) })
+    }
 }
 
 /// The integer type used for vma flags.

From patchwork Wed Jan 15 13:35:07 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940447
Date: Wed, 15 Jan 2025 13:35:07 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-4-375099ae017a@google.com>
Subject: [PATCH v12 4/8] mm: rust: add lock_vma_under_rcu
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

Currently, the Binder driver always uses the mmap lock to make changes to
its vma. Because the mmap lock is global to the process, this can involve
significant contention. However, the kernel has a feature called per-vma
locks, which can significantly reduce contention. For example, you can
take a vma lock in parallel with an mmap write lock. This is important
because contention on the mmap lock has been a long-term recurring
challenge for the Binder driver.

This patch introduces support for using `lock_vma_under_rcu` from Rust.
The Rust Binder driver will be able to use this to reduce contention on
the mmap lock.
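
For illustration only, the calling pattern this enables is to try the
per-vma read lock first and to fall back to the mmap read lock when the
trylock fails. The sketch below is a userspace model of that pattern, not
kernel code. `Mm`, `Vma`, `LockUsed`, `write_lock_vma` and `vma_len` are
hypothetical stand-ins built on `std::sync::RwLock`; only the name
`lock_vma_under_rcu` and the trylock-then-fallback shape are taken from
this patch.

```rust
use std::ops::Deref;
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

/// Hypothetical stand-in for a VMA: an address range plus its own lock.
pub struct Vma {
    pub start: usize,
    pub end: usize,
    lock: RwLock<()>,
}

/// Model of a guard that owns the per-vma read lock until it is dropped.
pub struct VmaReadGuard<'a> {
    vma: &'a Vma,
    _guard: RwLockReadGuard<'a, ()>,
}

// Make the `Vma` fields reachable through the guard.
impl Deref for VmaReadGuard<'_> {
    type Target = Vma;

    fn deref(&self) -> &Vma {
        self.vma
    }
}

/// Hypothetical stand-in for an mm: one process-wide lock plus a list of VMAs.
pub struct Mm {
    mmap_lock: RwLock<()>,
    vmas: Vec<Vma>,
}

/// Which lock ended up protecting the access.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LockUsed {
    PerVma,
    MmapFallback,
}

impl Mm {
    /// Build a model mm from `(start, end)` ranges (end exclusive).
    pub fn new(ranges: &[(usize, usize)]) -> Self {
        let vmas = ranges
            .iter()
            .map(|&(start, end)| Vma { start, end, lock: RwLock::new(()) })
            .collect();
        Mm { mmap_lock: RwLock::new(()), vmas }
    }

    fn find(&self, addr: usize) -> Option<&Vma> {
        self.vmas.iter().find(|v| v.start <= addr && addr < v.end)
    }

    /// Optimistic trylock: returns `None` when no VMA covers `addr` or when
    /// the per-vma lock is contended.
    pub fn lock_vma_under_rcu(&self, addr: usize) -> Option<VmaReadGuard<'_>> {
        let vma = self.find(addr)?;
        let guard = vma.lock.try_read().ok()?;
        Some(VmaReadGuard { vma, _guard: guard })
    }

    /// Model-only helper that simulates a writer holding the per-vma lock.
    pub fn write_lock_vma(&self, index: usize) -> RwLockWriteGuard<'_, ()> {
        self.vmas[index].lock.write().unwrap()
    }
}

/// Returns the size of the VMA covering `addr` and the lock that was used.
pub fn vma_len(mm: &Mm, addr: usize) -> Option<(LockUsed, usize)> {
    // Fast path: per-vma read lock.
    if let Some(vma) = mm.lock_vma_under_rcu(addr) {
        return Some((LockUsed::PerVma, vma.end - vma.start));
    }

    // Slow path: take the process-wide read lock and look the VMA up again.
    // In this simplified model the fallback does not wait for the writer.
    let _mmap_guard = mm.mmap_lock.read().unwrap();
    let vma = mm.find(addr)?;
    Some((LockUsed::MmapFallback, vma.end - vma.start))
}
```

In this model, `vma_len(&mm, addr)` takes the fast path when the per-vma
lock is free, and holding the guard returned by `write_lock_vma` forces
the fallback path. An address that no VMA covers yields `None` on both
paths.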

Acked-by: Lorenzo Stoakes (for mm bits)
Reviewed-by: Jann Horn
Signed-off-by: Alice Ryhl
---
 rust/helpers/mm.c |  5 +++++
 rust/kernel/mm.rs | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7b72eb065a3e..81b510c96fd2 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
 {
 	return vma_lookup(mm, addr);
 }
+
+void rust_helper_vma_end_read(struct vm_area_struct *vma)
+{
+	vma_end_read(vma);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index ee1a062ec7d7..60dc66972576 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -18,6 +18,7 @@
 use core::{ops::Deref, ptr::NonNull};
 
 pub mod virt;
+use virt::VmAreaRef;
 
 /// A wrapper for the kernel's `struct mm_struct`.
 ///
@@ -160,6 +161,36 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
         unsafe { &*ptr.cast() }
     }
 
+    /// Attempt to access a vma using the vma read lock.
+    ///
+    /// This is an optimistic trylock operation, so it may fail if there is contention. In that
+    /// case, you should fall back to taking the mmap read lock.
+    ///
+    /// When per-vma locks are disabled, this always returns `None`.
+    #[inline]
+    pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
+        #[cfg(CONFIG_PER_VMA_LOCK)]
+        {
+            // SAFETY: Calling `bindings::lock_vma_under_rcu` is always okay given an mm where
+            // `mm_users` is non-zero.
+            let vma = unsafe { bindings::lock_vma_under_rcu(self.as_raw(), vma_addr as _) };
+            if !vma.is_null() {
+                return Some(VmaReadGuard {
+                    // SAFETY: If `lock_vma_under_rcu` returns a non-null ptr, then it points at a
+                    // valid vma. The vma is stable for as long as the vma read lock is held.
+                    vma: unsafe { VmAreaRef::from_raw(vma) },
+                    _nts: NotThreadSafe,
+                });
+            }
+        }
+
+        // Silence warnings about unused variables.
+        #[cfg(not(CONFIG_PER_VMA_LOCK))]
+        let _ = vma_addr;
+
+        None
+    }
+
     /// Lock the mmap read lock.
     #[inline]
     pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
@@ -228,3 +259,32 @@ fn drop(&mut self) {
         unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
     }
 }
+
+/// A guard for the vma read lock.
+///
+/// # Invariants
+///
+/// This `VmaReadGuard` owns the vma read lock.
+pub struct VmaReadGuard<'a> {
+    vma: &'a VmAreaRef,
+    // `vma_end_read` must be called on the same thread as where the lock was taken
+    _nts: NotThreadSafe,
+}
+
+// Make all `VmAreaRef` methods available on `VmaReadGuard`.
+impl Deref for VmaReadGuard<'_> {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        self.vma
+    }
+}
+
+impl Drop for VmaReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::vma_end_read(self.vma.as_ptr()) };
+    }
+}

From patchwork Wed Jan 15 13:35:08 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940448
Date: Wed, 15 Jan 2025 13:35:08 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-5-375099ae017a@google.com>
Subject: [PATCH v12 5/8] mm: rust: add mmput_async support
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka, John Hubbard, "Liam R.
Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspamd-Queue-Id: 2572218000D X-Rspamd-Server: rspam12 X-Stat-Signature: u7wo9nbzqf8uuq8byu7y8gemgumwnkff X-Rspam-User: X-HE-Tag: 1736948176-407360 X-HE-Meta: U2FsdGVkX1+mAypTV+TW1GIaKkrNPhePjs1Rnf9aKIGkMyYjaaighmy5yDCNC9tKDMo5YPuC643Py1J10TWzNch+EqfvOOYC/MFt2sxOFTpcnLw6kMkc3Z0zXS6KQ4QwGA+ssAxv4lk/jBvqZhBBULIh6PrYvMhMkh0o3iTosM/YuCJ6YGxRoPjHjV824lajm0Xzl0TgmEgrqyQnTTTnCRy1lzZ2OTo9OXL76o+ZHoDRVHZn9FGVCEXVOlDApmnzedJRyl5p5X3cVyocuk/9YNG/wGQzqpcCeVs5oXMzSiX4TGQwhJxMXV+n+x3qH/MPEtSyMFepEXwjE0jeXD34rxyeh0W36i8O6eCrUwRxTSwZJXHLuiy4+33od5hQumvS4Rw6cT9kSeho2wt0Ue+KgWQ3E3MAhTfdiiyiPJCOcoJ2VdnaF67nGMbJ4RP+C7T1D+WC7PUbxelOHJAal00fjX779EPM/JUm6X/aHrOTDXL8enMjdVYTWyCRIBjWcqqvmhU3bywK+LZWrnXGT43EQx92ybM0/R95e5IluwYgVbij7ZrJP1ykUo8UaAK5Nfgngqhg9KRigtuVNUzmAdn1Wz5dFB3f+kdsWwT+kMpatS4L5eB1yiumghOY1dOqhEDqmam1lK6tmlV0w7sv3MP+gIYRWhhlfapsWsWlp3xsVQMHBsZVU6Z98RE77WtzwZ3GqksNB3VptmWLKjyn1lIl8yS1WrsQUOBUAjnicJrGrXmYsK3FWSwXXAM8GYUVrTv9Fr8ClFmbY8GAxCr6xd4ELLIjIQgfG4ZQrw3X5Ftpsq2LocWx1bvoSkSgwRIeMWyXJL9JC8J3TYwAZOVBqA3S0qYwOEXpi6OCOY7i4wvCY8dMSV3rAghQujE0anRLy/MPGMt5dYr3Wzmxw6vFkPqPFuio2aIEpQmbPNLe90DwpoHLy2Dpmb2Kij6wHPkGXnKt2pU5vMKglgsSvWi17ti r1YZbDhf 
pmy8gWYnuIwNzViM5APhhp7uodV3GHXr/43qaqu9eKtJDkJajBPYQ4vto/ZOOxPbRUSakNtWf/wMIRYI3hqs+sN8ogTUkFLc4QXmDwYc3QXgKOccv0kEhXqLPG0HPsQC0XfZrv7B6RMECUGXD+w1Ypw1+/7rDwcCJRvVqQprL3zfeXaQ8MEDp4fD7SMe6X1PZw+JigrfzcvwIEL05X+hhSJBt63oPPsov3oqiaOCtRlCPe8xEz+KJ+ynvqIDiYxTB/TJYLGI6ZGtenKZg1KN1eLQApvwgUhcX33JJGknWdLEkU2uMvAooKaR2ZS7M/hASrZcK41THyWeg9+cyqrGkLerU8ryHZw/hNZ7zliubGXoZ0rvI3U01dtFrMsrU0P0lx42yz9ZSmj7PD0k/hrV1vW44QO90YX137j5o90uqydZUV7tXrPRY8AF3HlLtoryVs0fdyyUqMB0t3yITZNV+JlzqhlOUNjoav+1ePRk3AUdFyMEfsykU8sB/xNz3ESVqoLQRs1GQ7CiPe8H8hZuW8HN1c2NH4Pt+TM6qxTOFp0mi3hKo/mwQUbF3Ty4Nv3gURUIVGqzhIBQOlJCB1B6ruVkAlcHhzC47tkrCS5kMMpEZCkUfNQmVXajmFPKiXrCtQP0bc7BRtdZlXKfVZZLrUH3b6Z/E79TzLaryEZJf9h9/2CRMw5jIURPm52QP8hN5yh3pVlivppIiRPeUXIGHEzIaDMzjidBV9NVgwiCaEgRh1mw= X-Bogosity: Unsure, tests=bogofilter, spamicity=0.477451, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Adds an MmWithUserAsync type that uses mmput_async when dropped but is otherwise identical to MmWithUser. This has to be done using a separate type because the thing we are changing is the destructor. Rust Binder needs this to avoid a certain deadlock. See commit 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for details. It's also needed in the shrinker to avoid cleaning up the mm in the shrinker's context. Reviewed-by: Andreas Hindborg Acked-by: Lorenzo Stoakes (for mm bits) Signed-off-by: Alice Ryhl --- rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs index 60dc66972576..3bb2ccd5fda1 100644 --- a/rust/kernel/mm.rs +++ b/rust/kernel/mm.rs @@ -110,6 +110,48 @@ fn deref(&self) -> &Mm { } } +/// A wrapper for the kernel's `struct mm_struct`. +/// +/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a +/// refcount. 
This means that the destructor of `ARef` is safe to call in atomic +/// context. +/// +/// # Invariants +/// +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero. +#[repr(transparent)] +pub struct MmWithUserAsync { + mm: MmWithUser, +} + +// SAFETY: It is safe to call `mmput_async` on another thread than where `mmget` was called. +unsafe impl Send for MmWithUserAsync {} +// SAFETY: All methods on `MmWithUserAsync` can be called in parallel from several threads. +unsafe impl Sync for MmWithUserAsync {} + +// SAFETY: By the type invariants, this type is always refcounted. +unsafe impl AlwaysRefCounted for MmWithUserAsync { + fn inc_ref(&self) { + // SAFETY: The pointer is valid since self is a reference. + unsafe { bindings::mmget(self.as_raw()) }; + } + + unsafe fn dec_ref(obj: NonNull) { + // SAFETY: The caller is giving up their refcount. + unsafe { bindings::mmput_async(obj.cast().as_ptr()) }; + } +} + +// Make all `MmWithUser` methods available on `MmWithUserAsync`. +impl Deref for MmWithUserAsync { + type Target = MmWithUser; + + #[inline] + fn deref(&self) -> &MmWithUser { + &self.mm + } +} + // These methods are safe to call even if `mm_users` is zero. impl Mm { /// Returns a raw pointer to the inner `mm_struct`. @@ -161,6 +203,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser { unsafe { &*ptr.cast() } } + /// Use `mmput_async` when dropping this refcount. + #[inline] + pub fn into_mmput_async(me: ARef) -> ARef { + // SAFETY: The layouts and invariants are compatible. + unsafe { ARef::from_raw(ARef::into_raw(me).cast()) } + } + /// Attempt to access a vma using the vma read lock. /// /// This is an optimistic trylock operation, so it may fail if there is contention. 
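The idea in this patch, that converting a reference changes only which destructor eventually runs and leaves the count alone, can be sketched in plain userspace Rust. This is a minimal model under stated assumptions, not the kernel implementation: `FakeMm`, `MmRef`, `ReleasePolicy`, `PutSync` and `PutAsync` are hypothetical names, and the sketch selects the destructor with a type parameter instead of the `#[repr(transparent)]` pointer cast the patch uses.

```rust
use std::cell::{Cell, RefCell};
use std::marker::PhantomData;

/// Hypothetical stand-in for the kernel's `struct mm_struct`.
struct FakeMm {
    users: Cell<u32>,
    /// Records which release path ran, in order.
    log: RefCell<Vec<&'static str>>,
}

/// Chooses what happens when a counted reference is dropped.
trait ReleasePolicy {
    fn release(mm: &FakeMm);
}

/// Models `mmput`: cleanup would run in the caller's context.
struct PutSync;
/// Models `mmput_async`: cleanup would be handed off elsewhere.
struct PutAsync;

impl ReleasePolicy for PutSync {
    fn release(mm: &FakeMm) {
        mm.users.set(mm.users.get() - 1);
        mm.log.borrow_mut().push("mmput");
    }
}

impl ReleasePolicy for PutAsync {
    fn release(mm: &FakeMm) {
        mm.users.set(mm.users.get() - 1);
        mm.log.borrow_mut().push("mmput_async");
    }
}

/// A counted reference whose destructor is selected by `P`.
struct MmRef<'a, P: ReleasePolicy> {
    mm: &'a FakeMm,
    _policy: PhantomData<P>,
}

impl<'a> MmRef<'a, PutSync> {
    /// Models `mmget`: take one more user reference.
    fn get(mm: &'a FakeMm) -> Self {
        mm.users.set(mm.users.get() + 1);
        MmRef { mm, _policy: PhantomData }
    }

    /// Models `into_mmput_async`: the count is untouched; only the
    /// destructor that will eventually run is different.
    fn into_async(self) -> MmRef<'a, PutAsync> {
        let mm = self.mm;
        std::mem::forget(self); // skip the synchronous release
        MmRef { mm, _policy: PhantomData }
    }
}

impl<P: ReleasePolicy> Drop for MmRef<'_, P> {
    fn drop(&mut self) {
        P::release(self.mm);
    }
}

/// Takes two references, converts one, and drops both.
fn demo() -> (u32, Vec<&'static str>) {
    let mm = FakeMm { users: Cell::new(0), log: RefCell::new(Vec::new()) };
    {
        let a = MmRef::get(&mm);
        let b = MmRef::get(&mm).into_async();
        drop(a);
        drop(b);
    }
    (mm.users.get(), mm.log.into_inner())
}

fn main() {
    let (users, log) = demo();
    println!("users = {users}, log = {log:?}");
}
```

Because the release behaviour lives in the type, a caller holding the async flavour cannot accidentally trigger the synchronous path, which is the property the commit message relies on.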
From patchwork Wed Jan 15 13:35:09 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940449
Date: Wed, 15 Jan 2025 13:35:09 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-6-375099ae017a@google.com>
Subject: [PATCH v12 6/8] mm: rust: add VmAreaNew for f_ops->mmap()
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka, John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman, Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

This type will be used when setting up a new vma in an f_ops->mmap() hook.
Using a separate type from VmAreaRef allows us to have a separate set of
operations that you are only able to use during the mmap() hook. For
example, the VM_MIXEDMAP flag must not be changed after the initial
setup that happens during the f_ops->mmap() hook.

To avoid setting invalid flag values, the methods for clearing
VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
the return value results in a compilation error because the `Result`
type is marked #[must_use].

For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
we add a VM_PFNMAP method, we will need some way to prevent you from
setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.

Acked-by: Lorenzo Stoakes (for mm bits)
Reviewed-by: Jann Horn
Signed-off-by: Alice Ryhl
---
 rust/kernel/mm/virt.rs | 186 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 185 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index ab89a526d3e4..ef940973e231 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -16,7 +16,7 @@
 
 use crate::{
     bindings,
-    error::{to_result, Result},
+    error::{code::EINVAL, to_result, Result},
     mm::MmWithUser,
     page::Page,
     types::Opaque,
@@ -203,6 +203,190 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
     }
 }
 
+/// A configuration object for setting up a VMA in an `f_ops->mmap()` hook.
+///
+/// The `f_ops->mmap()` hook is called when a new VMA is being created, and the hook is able to
+/// configure the VMA in various ways to fit the driver that owns it. Using `VmAreaNew` indicates
+/// that you are allowed to perform operations on the VMA that can only be performed before the VMA
+/// is fully initialized.
+///
+/// # Invariants
+///
+/// For the duration of 'a, the referenced vma must be undergoing initialization in an
+/// `f_ops->mmap()` hook.
+pub struct VmAreaNew {
+    vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaNew`.
+impl Deref for VmAreaNew {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        &self.vma
+    }
+}
+
+impl VmAreaNew {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *mut bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Internal method for updating the vma flags.
+    ///
+    /// # Safety
+    ///
+    /// This must not be used to set the flags to an invalid value.
+    #[inline]
+    unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
+        let mut flags = self.flags();
+        flags |= set;
+        flags &= !unset;
+
+        // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
+        // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
+        // The caller promises that this does not set the flags to an invalid value.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.__vm_flags = flags };
+    }
+
+    /// Set the `VM_MIXEDMAP` flag on this vma.
+    ///
+    /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
+    /// that can be used to call `vm_insert_page` on the vma.
+    #[inline]
+    pub fn set_mixedmap(&self) -> &VmAreaMixedMap {
+        // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
+        // invalid state.
+        unsafe { self.update_flags(flags::MIXEDMAP, 0) };
+
+        // SAFETY: We just set `VM_MIXEDMAP` on the vma.
+        unsafe { VmAreaMixedMap::from_raw(self.vma.as_ptr()) }
+    }
+
+    /// Set the `VM_IO` flag on this vma.
+    ///
+    /// This is used for memory mapped IO and similar. The flag tells other parts of the kernel to
+    /// avoid looking at the pages. For memory mapped IO this is useful as accesses to the pages
+    /// could have side effects.
+    #[inline]
+    pub fn set_io(&self) {
+        // SAFETY: Setting the VM_IO flag is always okay.
+        unsafe { self.update_flags(flags::IO, 0) };
+    }
+
+    /// Set the `VM_DONTEXPAND` flag on this vma.
+    ///
+    /// This prevents the vma from being expanded with `mremap()`.
+    #[inline]
+    pub fn set_dontexpand(&self) {
+        // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
+        unsafe { self.update_flags(flags::DONTEXPAND, 0) };
+    }
+
+    /// Set the `VM_DONTCOPY` flag on this vma.
+    ///
+    /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
+    /// is set.
+    #[inline]
+    pub fn set_dontcopy(&self) {
+        // SAFETY: Setting the VM_DONTCOPY flag is always okay.
+        unsafe { self.update_flags(flags::DONTCOPY, 0) };
+    }
+
+    /// Set the `VM_DONTDUMP` flag on this vma.
+    ///
+    /// This prevents the vma from being included in core dumps. This option is only permanent if
+    /// `VM_IO` is set.
+    #[inline]
+    pub fn set_dontdump(&self) {
+        // SAFETY: Setting the VM_DONTDUMP flag is always okay.
+        unsafe { self.update_flags(flags::DONTDUMP, 0) };
+    }
+
+    /// Returns whether `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as readable.
+    #[inline]
+    pub fn readable(&self) -> bool {
+        (self.flags() & flags::READ) != 0
+    }
+
+    /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma readable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYREAD` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayread(&self) -> Result {
+        if self.readable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
+        unsafe { self.update_flags(0, flags::MAYREAD) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as writable.
+    #[inline]
+    pub fn writable(&self) -> bool {
+        (self.flags() & flags::WRITE) != 0
+    }
+
+    /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma writable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYWRITE` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_maywrite(&self) -> Result {
+        if self.writable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
+        unsafe { self.update_flags(0, flags::MAYWRITE) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as executable.
+    #[inline]
+    pub fn executable(&self) -> bool {
+        (self.flags() & flags::EXEC) != 0
+    }
+
+    /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma executable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYEXEC` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayexec(&self) -> Result {
+        if self.executable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
+        unsafe { self.update_flags(0, flags::MAYEXEC) };
+        Ok(())
+    }
+}
+
 /// The integer type used for vma flags.
 #[doc(inline)]
 pub use bindings::vm_flags_t;

From patchwork Wed Jan 15 13:35:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13940450
Date: Wed, 15 Jan 2025 13:35:10 +0000
In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com>
References: <20250115-vma-v12-0-375099ae017a@google.com>
Message-ID: <20250115-vma-v12-7-375099ae017a@google.com>
Subject: [PATCH v12 7/8] rust: miscdevice: add mmap support
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka, John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman, Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

Add the ability to write a file_operations->mmap hook in Rust when using
the miscdevice abstraction.
The `vma` argument to the `mmap` hook uses the `VmAreaNew` type from the previous commit; this type provides the correct set of operations for a file_operations->mmap hook. Acked-by: Lorenzo Stoakes (for mm bits) Signed-off-by: Alice Ryhl --- rust/kernel/miscdevice.rs | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs index 7e2a79b3ae26..0a0ef172304c 100644 --- a/rust/kernel/miscdevice.rs +++ b/rust/kernel/miscdevice.rs @@ -11,6 +11,8 @@ use crate::{ bindings, error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR}, + fs::File, + mm::virt::VmAreaNew, prelude::*, str::CStr, types::{ForeignOwnable, Opaque}, @@ -110,6 +112,22 @@ fn release(device: Self::Ptr) { drop(device); } + /// Handle for mmap. + /// + /// This function is invoked when a user space process invokes the `mmap` system call on + /// `file`. The function is a callback that is part of the VMA initializer. The kernel will do + /// initial setup of the VMA before calling this function. The function can then interact with + /// the VMA initialization by calling methods of `vma`. If the function does not return an + /// error, the kernel will complete initialization of the VMA according to the properties of + /// `vma`. + fn mmap( + _device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>, + _file: &File, + _vma: &VmAreaNew, + ) -> Result { + kernel::build_error!(VTABLE_DEFAULT_ERROR) + } + /// Handler for ioctls. /// /// The `cmd` argument is usually manipulated using the utilties in [`kernel::ioctl`].
@@ -156,6 +174,7 @@ impl<T: MiscDevice> VtableHelper<T> { const VTABLE: bindings::file_operations = bindings::file_operations { open: Some(fops_open::<T>), release: Some(fops_release::<T>), + mmap: maybe_fn(T::HAS_MMAP, fops_mmap::<T>), unlocked_ioctl: maybe_fn(T::HAS_IOCTL, fops_ioctl::<T>), #[cfg(CONFIG_COMPAT)] compat_ioctl: if T::HAS_COMPAT_IOCTL { @@ -216,6 +235,32 @@ impl<T: MiscDevice> VtableHelper<T> { 0 } +/// # Safety +/// +/// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`. +/// `vma` must be a vma that is currently being mmap'ed with this file. +unsafe extern "C" fn fops_mmap<T: MiscDevice>( + file: *mut bindings::file, + vma: *mut bindings::vm_area_struct, +) -> c_int { + // SAFETY: The mmap call of a file can access the private data. + let private = unsafe { (*file).private_data }; + // SAFETY: This is a Rust Miscdevice, so we call `into_foreign` in `open` and `from_foreign` in + // `release`, and `fops_mmap` is guaranteed to be called between those two operations. + let device = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) }; + // SAFETY: The caller provides a vma that is undergoing initial VMA setup. + let area = unsafe { VmAreaNew::from_raw(vma) }; + // SAFETY: + // * The file is valid for the duration of this call. + // * There is no active fdget_pos region on the file on this thread. + let file = unsafe { File::from_raw_file(file) }; + + match T::mmap(device, file, area) { + Ok(()) => 0, + Err(err) => err.to_errno() as c_int, + } +} + /// # Safety /// /// `file` must be a valid file that is associated with a `MiscDeviceRegistration<T>`.
From patchwork Wed Jan 15 13:35:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 13940451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90506C02183 for ; Wed, 15 Jan 2025 13:36:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 51328280004; Wed, 15 Jan 2025 08:36:26 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 49A79280001; Wed, 15 Jan 2025 08:36:26 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2CA18280004; Wed, 15 Jan 2025 08:36:26 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id ECC41280001 for ; Wed, 15 Jan 2025 08:36:25 -0500 (EST) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id BAB6A1A09F2 for ; Wed, 15 Jan 2025 13:36:25 +0000 (UTC) X-FDA: 83009785530.11.5EF5485 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) by imf14.hostedemail.com (Postfix) with ESMTP id B8FB5100005 for ; Wed, 15 Jan 2025 13:36:23 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=MAbcOSnP; spf=pass (imf14.hostedemail.com: domain of 31rmHZwkKCKsLWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=31rmHZwkKCKsLWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; 
t=1736948183; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=YDhR0wZS7ts5Hgraw2zWBvxuj4q+PFH0DkTWwrU34Og=; b=pu087TTrMJUR6mS++4QsmXS/EZb45AmSuc54B3fxMvq4Xt+29cXpPLIuTD4OSm6CYuAuxa dZpSTOGLP5y80KcNSbPkyJbnM79B7EKeSF2IQ0jelIAoKoqF0idLYlzBzItw9ax6hDqgz/ yIU+mgWxgvctqqOVsJaATEdl0AYw3uk= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=MAbcOSnP; spf=pass (imf14.hostedemail.com: domain of 31rmHZwkKCKsLWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com designates 209.85.128.73 as permitted sender) smtp.mailfrom=31rmHZwkKCKsLWTNPcjSWRZZRWP.NZXWTYfi-XXVgLNV.ZcR@flex--aliceryhl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1736948183; a=rsa-sha256; cv=none; b=RI8FjMzOxOp4KqA3fefDnPJtrPnuSX+lRHK2Oezcrfelv2q0g4P+4xZCMClKcRFvMFtzIl Ngopxr2xEayETaiIRT43/EJpwVnGoZWet0CMwW1t3vvK79fwgPULLXphSc7ZIhqU1yODGi hLFsPZPZTjJFxL/sOSHvf5x2/Zgkr/A= Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-43624b08181so4454475e9.0 for ; Wed, 15 Jan 2025 05:36:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1736948182; x=1737552982; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YDhR0wZS7ts5Hgraw2zWBvxuj4q+PFH0DkTWwrU34Og=; b=MAbcOSnPgLVwfWApBnKPRGPxYKNIG/4mV6OzDJGHm9T4HtzMGE34MiVHdLWcMYEnFz E+SmpeZdnDHCTU3d7LUmgG2rgRZ57aLaEtyihVwY3JL4oLfixXFRC0QFcyZb37gZljW/ gt3IUkMgYrMwHfecICvYBI/3YaFxpjGQlpQGRn4nIUEsVLe4NQilBnyBfou28dn071q7 pqYhMwYhNuFBF5JJCigbOXweXUZjUXWtA5LqzH5Vc3E9vURnUB4lGCvUWvV93AyKvi/c 9V55pXQ/DPJzN1m1RY1T5tq3kho31mejhwYVHuwVjxF1i5KRXa931DYe8+WpxPrl98aV g+Rw== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736948182; x=1737552982; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YDhR0wZS7ts5Hgraw2zWBvxuj4q+PFH0DkTWwrU34Og=; b=A51U5YzzZw4XtAbgkxCVvY6RMbHYNjkg8ulXbVlOA34SxeVe8rLU5q4+SDmAhtyx8i jN/nARYnc3sbRjLHgizmaiL0zUEY1ha2wSfgP5bzkhiMMoiqvQ1c0jjDsvU6qfprTH6i w19c2Vpg+iBS/P+dtgU6bTzRKF5pHR56/Ong896pmMUz5fThFfSchunacAtmQIkvy1JN k0Mh35qc7yRWatlRHgn3Gqb0OSkoNbOBaNqr/FrS4+2PQhkkr4CDpN1T6VekdK3OUD1+ R6ODRHiP+KNqKr+XVc+zR4ako+3Yuu/S3M9BJ4UlyKWKYe3mSq1x7H6LQfkRuxb82XZ9 vQWA== X-Forwarded-Encrypted: i=1; AJvYcCUYXixPqynwSw66BBKW3viRKW1jR0FE5ry3wOb+uZtE2aGPrXpywYgzmYwLMyzA5OEPwWSJ+Pvf0g==@kvack.org X-Gm-Message-State: AOJu0Yx93+6B8ELgLHgjO0Vesmq7XAw0PqCbDrIFf3NKHQZTyJ8mX76Y Y1ac7B0+VgiNmtU30+6WiEkWaCZ43fPml+wkXO2aXSxOtecozpp5OWBSVr+SBbq2e2/2gNIJxw/ Hfuo/zfNbIKCiFQ== X-Google-Smtp-Source: AGHT+IGFHQJdwt5YPoPxZfBaIutfQ8q4HgoaMKHTjrC0thoiPd+KnDkvpCu6E3x9F6CpQFEGaWKYx3hHr0HQTaA= X-Received: from wmqa11.prod.google.com ([2002:a05:600c:348b:b0:434:a8d7:e59b]) (user=aliceryhl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:1588:b0:436:747d:55c9 with SMTP id 5b1f17b1804b1-437c6af20d2mr23578795e9.5.1736948182452; Wed, 15 Jan 2025 05:36:22 -0800 (PST) Date: Wed, 15 Jan 2025 13:35:11 +0000 In-Reply-To: <20250115-vma-v12-0-375099ae017a@google.com> Mime-Version: 1.0 References: <20250115-vma-v12-0-375099ae017a@google.com> X-Developer-Key: i=aliceryhl@google.com; a=openpgp; fpr=49F6C1FAA74960F43A5B86A1EE7A392FDE96209F X-Developer-Signature: v=1; a=openpgp-sha256; l=18083; i=aliceryhl@google.com; h=from:subject:message-id; bh=8+9yOLbQvy2W2D3bb6/KIfbQXKfIeZCeNeOPvjLjO40=; b=owEBbQKS/ZANAwAKAQRYvu5YxjlGAcsmYgBnh7nB4k4e6GSUVjgK9NZzvzQOqImChi4oNAUub MRzxgN0iamJAjMEAAEKAB0WIQSDkqKUTWQHCvFIvbIEWL7uWMY5RgUCZ4e5wQAKCRAEWL7uWMY5 Ri1iD/47/BY10kHU19L8JbN8KWkVyDiDY0mn9KC60V92hs44qQhmMFGFVpYGuixI4cFmAId3ZYa 
0cBeFxMG2RRWp0UYQqaPdkXv0laWFtK3CsgFaM79vu0cNHOIYNM+JM6R2pGzy/fonmaNJ6+jbpp +r+idgCFwJkGC3XAzP81VEfY5dE20IvewCYxgUIp9tv4h7NC1zUl1FjZx0Vu9wgdKzEUuDgg/bq UZTSjkDhZ4znhzzEJExfS6lKyWYgnWNS8snn5wcifGJxYrn2Qq3XnGohPvDoqrZIQXyvjcxkmqL aglzC5KFH729r95zSCmwCg1QcQb/1u5l0P1Yl5Dfh4b3867lQ8crRMgPbdIt68loMTCzzsG8vED HjqcnYiJ+84cGw4BPAHPZ4wx7HVLnwNy3qkAijje86EAktfeYnSofOCTlc7s8+tHzts1RvWBaa3 TsGO4QDBjJ+Wj14aZXr43rYzZ+fupGJeyf0PITxZoZY5NkjFqGmMe1umOVZuQZuo/GJC9hwds7z jFoHtOOlfWZBEk007a3/xV0Kt6smk3yr9TSlJ32bRurWpFKJRiQKQdJkttbmCK1+6rl5U+xYBfQ FEwR4s4vsoFBUdzRWn0CDcX/2vZ3qshm4TVtoTjEV3WunEPngnUCMpnIfZkcPeIWhgt4Y8i9zlP D/MoHcva98F0DUg== X-Mailer: b4 0.13.0 Message-ID: <20250115-vma-v12-8-375099ae017a@google.com> Subject: [PATCH v12 8/8] task: rust: rework how current is accessed From: Alice Ryhl To: Miguel Ojeda , Matthew Wilcox , Lorenzo Stoakes , Vlastimil Babka , John Hubbard , "Liam R. Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspamd-Server: rspam05 X-Stat-Signature: puibhwo59reyxret6yq6nr1pp6jngnpx X-Rspamd-Queue-Id: B8FB5100005 X-Rspam-User: X-HE-Tag: 1736948183-331577 X-HE-Meta: 
U2FsdGVkX1/WENwCe5k0UEyC/XxC8mhAQyYjqszz5Ekviveq+eEo58+/ovPJgrV/VZkKjAUECrsFp8T0HuKhjjZ1ctUhK8gzIPCOM13dvH2zPZhT7z2MA9g+QMymV1QJaNSaFCUy2wm3Gm6Z9Y4BRiraMmCQH6F4Hq9q71Zc2pEFL0IiGHRGTRO1l9YaVuJDu3sIAx0ug45uk1TsZfNMFL8aoIwJBCg9qFj68uhLo9/WWp7rQ4wPZjVmJo+78Ora1iCeZqXzuFVJIcRpBBmHkpVhdyLU3WeKseRvMNpgcT2vde8mlYSviHMv9dEWROJ3rnAPCkU5qxKsLaKEzx6vHwMGM986nVouALkNtPZOI2hF7A/fKIogcSsHg2fAQPeynYa9ewtJzlWoTWPVc3UAc7GGB3bbE8Xf6DbSUWgyIqpQjzXgyfkD1Tzy854wLUJ1qI9jJ/fn3vIWzfY+Nj0LqsB950lhyFM73mpWkz9kbDV3X2r46e4SOHXizMiCKc5VI6BFzF3JD6etZ5ae3iyf/Afb5FhHNTxkkrSHr5Mumz2jw+3CzexLD7RK6xKqWz4L5WZb27qXSBT1AFKZMtyS32pBjXGdwFq2UBiIw1/QeddlxL6guR1Xk94jgq87ZkyPcp5FBv4n8KcueQ2gyuqtRlctv/mSBpdy76WCW5X4phsdTy5KX9IuICd8VAL4FC2PU+qS+a9FAoosWprpkXizheZ9+s1YWml6QEKXX3G2n7m8DbuD/i5m0wQ2OMuqioXT9ueqTY/Ssooz65LX5qspsEp9Liugqha8GzDA0Kr8WycUa7D/WO48LZib6b5zCEt8kQC2osnIPAc98dM8ROT9fWB+YVv3XpSl4fbF6ZBBvYEmShRjJiMnh0Uvg5+oaV0Uxu32gw8QpT6Ak2UMDvPgnoU74Uc2sfHCgG6qCHLrBKO0gsxe/O/Kw/ZmvWUNwNe26TMc8rBVGFEVCmKp7y3 /2NSx/SY xp2Avf/JkzTVpEbHiHojGjTcZgWw6UFSsIWsvBVBdWHtBWcZ159kBb7dtY88JFu0qmt+hekVVoDXTmRsk0EhQYJea9VvgezomycSg/wwfly1C4tDUs5DyrXxfXURWdTbUvHR9RQpDSwK0ZSbi7U4fKCcUrrBq9KUMbGX1LsAmkDGq4xbZcUNpyq4DOb2QwgwH1ElDeDulBLB+eCqdU7iDhOSZLrtzHDyh3WMUDfhOuc13ip6YunYaUDeP+u4jSTs6KTKiHfrdD2+v5p3muczyGQwsJOGi+fQoztM/66NizEfbVLc9CxT/qF9iYvvGLa1Glqta3/7YkHxfF+hafnP6jU47Idu8yGX5/WzKpw+wvUYPAr5Fm+ogwUNYILAKsfrU2yPBpVz5aPr95UT6YYXrpxb1VJMxcaDf9kV/UKYl3PlvcLcs2vZ2RnetY8OtXXzm4OwSR5mXUVWUYH6LU2qXzKY2rUeXhf2zthdp+wrpZEPRgYXQOy6GhT0YHd2M8brykzwUhv/SvVEBwLownqiU/BoLtTEhFzytPYvi/nuNxi04UVa/Op+KG+cafN+sTWQc9dy7gtShRE6rf3PpZU63qOt1+vsoYSVxGgOiA2x6PBQi8UjNuEp0vMao6/MbJeCu2nD5BXGWX+xHTc01OfjMUmnUCwpSU1KZPK2n X-Bogosity: Ham, tests=bogofilter, spamicity=0.436607, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Introduce a new type called `CurrentTask` that lets you perform various operations that are only safe on the `current` task. 
Use the new type to provide a way to access the current mm without incrementing its refcount. With this change, you can write stuff such as let vma = current!().mm().lock_vma_under_rcu(addr); without incrementing any refcounts. This replaces the existing abstractions for accessing the current pid namespace. With the old approach, every field access to current involves both a macro and an unsafe helper function. The new approach simplifies that to a single safe function on the `CurrentTask` type. This makes it less heavy-weight to add additional current accessors in the future. That said, creating a `CurrentTask` type like the one in this patch requires that we are careful to ensure that it cannot escape the current task or otherwise access things after they are freed. To do this, I declared that it cannot escape the current "task context" where I defined a "task context" as essentially the region in which `current` remains unchanged. So e.g., release_task() or begin_new_exec() would leave the task context. If a userspace thread returns to userspace and later makes another syscall, then I consider the two syscalls to be different task contexts. This allows values stored in that task to be modified between syscalls, even if they're guaranteed to be immutable during a syscall. Ensuring correctness of `CurrentTask` is slightly tricky if we also want the ability to have a safe `kthread_use_mm()` implementation in Rust. To support that safely, there are two patterns we need to ensure are safe: // Case 1: current!() called inside the scope. let mm; kthread_use_mm(some_mm, || { mm = current!().mm(); }); drop(some_mm); mm.do_something(); // UAF and: // Case 2: current!() called before the scope.
let mm; let task = current!(); kthread_use_mm(some_mm, || { mm = task.mm(); }); drop(some_mm); mm.do_something(); // UAF The existing `current!()` abstraction already natively prevents the first case: The `&CurrentTask` would be tied to the inner scope, so the borrow-checker ensures that no reference derived from it can escape the scope. Fixing the second case is a bit more tricky. The solution is to essentially pretend that the contents of the scope execute on a different thread, which means that only thread-safe types can cross the boundary. Since `CurrentTask` is marked `NotThreadSafe`, attempts to move it to another thread will fail, and this includes our fake pretend thread boundary. This has the disadvantage that other types that aren't thread-safe for reasons unrelated to `current` also cannot be moved across the `kthread_use_mm()` boundary. I consider this an acceptable tradeoff. Reviewed-by: Boqun Feng Signed-off-by: Alice Ryhl --- rust/kernel/task.rs | 247 +++++++++++++++++++++++++++------------------------- 1 file changed, 129 insertions(+), 118 deletions(-) diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs index 07bc22a7645c..0b6cb9a83a2e 100644 --- a/rust/kernel/task.rs +++ b/rust/kernel/task.rs @@ -7,6 +7,7 @@ use crate::{ bindings, ffi::{c_int, c_long, c_uint}, + mm::MmWithUser, pid_namespace::PidNamespace, types::{ARef, NotThreadSafe, Opaque}, }; @@ -31,22 +32,20 @@ #[macro_export] macro_rules! current { () => { - // SAFETY: Deref + addr-of below create a temporary `TaskRef` that cannot outlive the - // caller. + // SAFETY: This expression creates a temporary value that is dropped at the end of the + // caller's scope. The following mechanisms ensure that the resulting `&CurrentTask` cannot + // leave current task context: + // + // * To return to userspace, the caller must leave the current scope. + // * Operations such as `begin_new_exec()` are necessarily unsafe and the caller of + // `begin_new_exec()` is responsible for safety.
+ // * Rust abstractions for things such as a `kthread_use_mm()` scope must require the + // closure to be `Send`, so the `NotThreadSafe` field of `CurrentTask` ensures that the + // `&CurrentTask` cannot cross the scope in either direction. unsafe { &*$crate::task::Task::current() } }; } -/// Returns the currently running task's pid namespace. -#[macro_export] -macro_rules! current_pid_ns { - () => { - // SAFETY: Deref + addr-of below create a temporary `PidNamespaceRef` that cannot outlive - // the caller. - unsafe { &*$crate::task::Task::current_pid_ns() } - }; -} - /// Wraps the kernel's `struct task_struct`. /// /// # Invariants @@ -85,7 +84,7 @@ macro_rules! current_pid_ns { /// impl State { /// fn new() -> Self { /// Self { -/// creator: current!().into(), +/// creator: ARef::from(&**current!()), /// index: 0, /// } /// } @@ -105,6 +104,44 @@ unsafe impl Send for Task {} // synchronised by C code (e.g., `signal_pending`). unsafe impl Sync for Task {} +/// Represents the [`Task`] in the `current` global. +/// +/// This type exists to provide more efficient operations that are only valid on the current task. +/// For example, to retrieve the pid-namespace of a task, you must use rcu protection unless it is +/// the current task. +/// +/// # Invariants +/// +/// Each value of this type must only be accessed from the task context it was created within. +/// +/// Of course, every thread is in a different task context, but for the purposes of this invariant, +/// these operations also permanently leave the task context: +/// +/// * Returning to userspace from system call context. +/// * Calling `release_task()`. +/// * Calling `begin_new_exec()` in a binary format loader. +/// +/// Other operations temporarily create a new sub-context: +/// +/// * Calling `kthread_use_mm()` creates a new context, and `kthread_unuse_mm()` returns to the +/// old context. 
+/// +/// This means that a `CurrentTask` obtained before a `kthread_use_mm()` call may be used again +/// once `kthread_unuse_mm()` is called, but it must not be used between these two calls. +/// Conversely, a `CurrentTask` obtained between a `kthread_use_mm()`/`kthread_unuse_mm()` pair +/// must not be used after `kthread_unuse_mm()`. +#[repr(transparent)] +pub struct CurrentTask(Task, NotThreadSafe); + +// Make all `Task` methods available on `CurrentTask`. +impl Deref for CurrentTask { + type Target = Task; + #[inline] + fn deref(&self) -> &Task { + &self.0 + } +} + /// The type of process identifiers (PIDs). type Pid = bindings::pid_t; @@ -131,119 +168,29 @@ pub fn current_raw() -> *mut bindings::task_struct { /// /// # Safety /// - /// Callers must ensure that the returned object doesn't outlive the current task/thread. - pub unsafe fn current() -> impl Deref<Target = Task> { - struct TaskRef<'a> { - task: &'a Task, - _not_send: NotThreadSafe, + /// Callers must ensure that the returned object is only used to access a [`CurrentTask`] + /// within the task context that was active when this function was called. For more details, + /// see the invariants section for [`CurrentTask`]. + pub unsafe fn current() -> impl Deref<Target = CurrentTask> { + struct TaskRef { + task: *const CurrentTask, } - impl Deref for TaskRef<'_> { - type Target = Task; + impl Deref for TaskRef { + type Target = CurrentTask; fn deref(&self) -> &Self::Target { - self.task + // SAFETY: The returned reference borrows from this `TaskRef`, so it cannot outlive + // the `TaskRef`, which the caller of `Task::current()` has promised will not + // outlive the task/thread for which `self.task` is the `current` pointer. Thus, it + // is okay to return a `CurrentTask` reference here. + unsafe { &*self.task } } } - let current = Task::current_raw(); TaskRef { - // SAFETY: If the current thread is still running, the current task is valid.
Given - // that `TaskRef` is not `Send`, we know it cannot be transferred to another thread - // (where it could potentially outlive the caller). - task: unsafe { &*current.cast() }, - _not_send: NotThreadSafe, - } - } - - /// Returns a PidNamespace reference for the currently executing task's/thread's pid namespace. - /// - /// This function can be used to create an unbounded lifetime by e.g., storing the returned - /// PidNamespace in a global variable which would be a bug. So the recommended way to get the - /// current task's/thread's pid namespace is to use the [`current_pid_ns`] macro because it is - /// safe. - /// - /// # Safety - /// - /// Callers must ensure that the returned object doesn't outlive the current task/thread. - pub unsafe fn current_pid_ns() -> impl Deref { - struct PidNamespaceRef<'a> { - task: &'a PidNamespace, - _not_send: NotThreadSafe, - } - - impl Deref for PidNamespaceRef<'_> { - type Target = PidNamespace; - - fn deref(&self) -> &Self::Target { - self.task - } - } - - // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`. - // - // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive. A - // `unshare(CLONE_NEWPID)` or `setns(fd_pidns/pidfd, CLONE_NEWPID)` will not have an effect - // on the calling `Task`'s pid namespace. It will only effect the pid namespace of children - // created by the calling `Task`. This invariant guarantees that after having acquired a - // reference to a `Task`'s pid namespace it will remain unchanged. - // - // When a task has exited and been reaped `release_task()` will be called. This will set - // the `PidNamespace` of the task to `NULL`. So retrieving the `PidNamespace` of a task - // that is dead will return `NULL`. Note, that neither holding the RCU lock nor holding a - // referencing count to - // the `Task` will prevent `release_task()` being called. 
- // - // In order to retrieve the `PidNamespace` of a `Task` the `task_active_pid_ns()` function - // can be used. There are two cases to consider: - // - // (1) retrieving the `PidNamespace` of the `current` task - // (2) retrieving the `PidNamespace` of a non-`current` task - // - // From system call context retrieving the `PidNamespace` for case (1) is always safe and - // requires neither RCU locking nor a reference count to be held. Retrieving the - // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath - // like that is exposed to Rust. - // - // Retrieving the `PidNamespace` from system call context for (2) requires RCU protection. - // Accessing `PidNamespace` outside of RCU protection requires a reference count that - // must've been acquired while holding the RCU lock. Note that accessing a non-`current` - // task means `NULL` can be returned as the non-`current` task could have already passed - // through `release_task()`. - // - // To retrieve (1) the `current_pid_ns!()` macro should be used which ensure that the - // returned `PidNamespace` cannot outlive the calling scope. The associated - // `current_pid_ns()` function should not be called directly as it could be abused to - // created an unbounded lifetime for `PidNamespace`. The `current_pid_ns!()` macro allows - // Rust to handle the common case of accessing `current`'s `PidNamespace` without RCU - // protection and without having to acquire a reference count. - // - // For (2) the `task_get_pid_ns()` method must be used. This will always acquire a - // reference on `PidNamespace` and will return an `Option` to force the caller to - // explicitly handle the case where `PidNamespace` is `None`, something that tends to be - // forgotten when doing the equivalent operation in `C`. Missing RCU primitives make it - // difficult to perform operations that are otherwise safe without holding a reference - // count as long as RCU protection is guaranteed. 
But it is not important currently. But we - // do want it in the future. - // - // Note for (2) the required RCU protection around calling `task_active_pid_ns()` - // synchronizes against putting the last reference of the associated `struct pid` of - // `task->thread_pid`. The `struct pid` stored in that field is used to retrieve the - // `PidNamespace` of the caller. When `release_task()` is called `task->thread_pid` will be - // `NULL`ed and `put_pid()` on said `struct pid` will be delayed in `free_pid()` via - // `call_rcu()` allowing everyone with an RCU protected access to the `struct pid` acquired - // from `task->thread_pid` to finish. - // - // SAFETY: The current task's pid namespace is valid as long as the current task is running. - let pidns = unsafe { bindings::task_active_pid_ns(Task::current_raw()) }; - PidNamespaceRef { - // SAFETY: If the current thread is still running, the current task and its associated - // pid namespace are valid. `PidNamespaceRef` is not `Send`, so we know it cannot be - // transferred to another thread (where it could potentially outlive the current - // `Task`). The caller needs to ensure that the PidNamespaceRef doesn't outlive the - // current task/thread. - task: unsafe { PidNamespace::from_ptr(pidns) }, - _not_send: NotThreadSafe, + // CAST: The layout of `struct task_struct` and `CurrentTask` is identical. + task: Task::current_raw().cast(), } } @@ -326,6 +273,70 @@ pub fn wake_up(&self) { } } +impl CurrentTask { + /// Access the address space of the current task. + /// + /// This function does not touch the refcount of the mm. + #[inline] + pub fn mm(&self) -> Option<&MmWithUser> { + // SAFETY: The `mm` field of `current` is not modified from other threads, so reading it is + // not a data race. + let mm = unsafe { (*self.as_ptr()).mm }; + + if mm.is_null() { + return None; + } + + // SAFETY: If `current->mm` is non-null, then it references a valid mm with a non-zero + // value of `mm_users`. 
Furthermore, the returned `&MmWithUser` borrows from this + // `CurrentTask`, so it cannot escape the scope in which the current pointer was obtained. + // + // This is safe even if `kthread_use_mm()`/`kthread_unuse_mm()` are used. There are two + // relevant cases: + // * If the `&CurrentTask` was created before `kthread_use_mm()`, then it cannot be + // accessed during the `kthread_use_mm()`/`kthread_unuse_mm()` scope due to the + // `NotThreadSafe` field of `CurrentTask`. + // * If the `&CurrentTask` was created within a `kthread_use_mm()`/`kthread_unuse_mm()` + // scope, then the `&CurrentTask` cannot escape that scope, so the returned `&MmWithUser` + // also cannot escape that scope. + // In either case, it's not possible to read `current->mm` and keep using it after the + // scope is ended with `kthread_unuse_mm()`. + Some(unsafe { MmWithUser::from_raw(mm) }) + } + + /// Access the pid namespace of the current task. + /// + /// This function does not touch the refcount of the namespace or use RCU protection. + /// + /// To access the pid namespace of another task, see [`Task::get_pid_ns`]. + #[doc(alias = "task_active_pid_ns")] + #[inline] + pub fn active_pid_ns(&self) -> Option<&PidNamespace> { + // SAFETY: It is safe to call `task_active_pid_ns` without RCU protection when calling it + // on the current task. + let active_ns = unsafe { bindings::task_active_pid_ns(self.as_ptr()) }; + + if active_ns.is_null() { + return None; + } + + // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`. + // + // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive. + // + // From system call context retrieving the `PidNamespace` for the current task is always + // safe and requires neither RCU locking nor a reference count to be held. Retrieving the + // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath + // like that is exposed to Rust. 
+ // + // SAFETY: If `current`'s pid ns is non-null, then it references a valid pid ns. + // Furthermore, the returned `&PidNamespace` borrows from this `CurrentTask`, so it cannot + // escape the scope in which the current pointer was obtained, e.g. it cannot live past a + // `release_task()` call. + Some(unsafe { PidNamespace::from_ptr(active_ns) }) + } +} + // SAFETY: The type invariants guarantee that `Task` is always refcounted. unsafe impl crate::types::AlwaysRefCounted for Task { fn inc_ref(&self) {