From patchwork Mon Feb 3 12:14:36 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957351
Date: Mon, 03 Feb 2025 12:14:36 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
X-Mailer: b4 0.13.0
Message-ID: <20250203-vma-v13-1-2b998268a396@google.com>
Subject: [PATCH v13 1/8] mm: rust: add abstraction for struct mm_struct
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl,
 Balbir Singh

These abstractions allow you to
reference a `struct mm_struct` using both mmgrab and mmget refcounts.
This is done using two Rust types:

* Mm - represents an mm_struct where you don't know anything about the
  value of mm_users.
* MmWithUser - represents an mm_struct where you know at compile time
  that mm_users is non-zero.

This allows us to encode in the type system whether a method requires
that mm_users is non-zero or not. For instance, you can always call
`mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
non-zero.

The struct is called Mm to keep consistency with the C side.

The ability to obtain `current->mm` is added later in this series.

Acked-by: Lorenzo Stoakes
Acked-by: Balbir Singh
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/helpers/helpers.c |   1 +
 rust/helpers/mm.c      |  39 +++++++++
 rust/kernel/lib.rs     |   1 +
 rust/kernel/mm.rs      | 209 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 250 insertions(+)

diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 0640b7e115be..97cfc2d29f5e 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -18,6 +18,7 @@
 #include "io.c"
 #include "jump_label.c"
 #include "kunit.c"
+#include "mm.c"
 #include "mutex.c"
 #include "page.c"
 #include "platform.c"
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
new file mode 100644
index 000000000000..7201747a5d31
--- /dev/null
+++ b/rust/helpers/mm.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+
+void rust_helper_mmgrab(struct mm_struct *mm)
+{
+	mmgrab(mm);
+}
+
+void rust_helper_mmdrop(struct mm_struct *mm)
+{
+	mmdrop(mm);
+}
+
+void rust_helper_mmget(struct mm_struct *mm)
+{
+	mmget(mm);
+}
+
+bool rust_helper_mmget_not_zero(struct mm_struct *mm)
+{
+	return mmget_not_zero(mm);
+}
+
+void rust_helper_mmap_read_lock(struct mm_struct *mm)
+{
+	mmap_read_lock(mm);
+}
+
+bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
+{
+	return mmap_read_trylock(mm);
+}
+
+void rust_helper_mmap_read_unlock(struct mm_struct *mm)
+{
+	mmap_read_unlock(mm);
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 496ed32b0911..9cf35fbff356 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -57,6 +57,7 @@
 pub mod kunit;
 pub mod list;
 pub mod miscdevice;
+pub mod mm;
 #[cfg(CONFIG_NET)]
 pub mod net;
 pub mod of;
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
new file mode 100644
index 000000000000..2fb5f440af60
--- /dev/null
+++ b/rust/kernel/mm.rs
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Memory management.
+//!
+//! This module deals with managing the address space of userspace processes. Each process has an
+//! instance of [`Mm`], which keeps track of multiple VMAs (virtual memory areas). Each VMA
+//! corresponds to a region of memory that the userspace process can access, and the VMA lets you
+//! control what happens when userspace reads or writes to that region of memory.
+//!
+//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
+
+use crate::{
+    bindings,
+    types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
+};
+use core::{ops::Deref, ptr::NonNull};
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This represents the address space of a userspace process, so each process has one `Mm`
+/// instance. It may hold many VMAs internally.
+///
+/// There is a counter called `mm_users` that counts the users of the address space; this includes
+/// the userspace process itself, but can also include kernel threads accessing the address space.
+/// Once `mm_users` reaches zero, this indicates that the address space can be destroyed. To access
+/// the address space, you must prevent `mm_users` from reaching zero while you are accessing it.
+/// The [`MmWithUser`] type represents an address space where this is guaranteed, and you can
+/// create one using [`mmget_not_zero`].
+///
+/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmgrab`.
+///
+/// [`mmget_not_zero`]: Mm::mmget_not_zero
+#[repr(transparent)]
+pub struct Mm {
+    mm: Opaque<bindings::mm_struct>,
+}
+
+// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
+unsafe impl Send for Mm {}
+// SAFETY: All methods on `Mm` can be called in parallel from several threads.
+unsafe impl Sync for Mm {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for Mm {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmgrab(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
+    }
+}
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
+/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
+/// refcount. It can be used to access the associated address space.
+///
+/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUser {
+    mm: Mm,
+}
+
+// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUser {}
+// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUser {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUser {
+    #[inline]
+    fn inc_ref(&self) {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmget(self.as_raw()) };
+    }
+
+    #[inline]
+    unsafe fn dec_ref(obj: NonNull<Self>) {
+        // SAFETY: The caller is giving up their refcount.
+        unsafe { bindings::mmput(obj.cast().as_ptr()) };
+    }
+}
+
+// Make all `Mm` methods available on `MmWithUser`.
+impl Deref for MmWithUser {
+    type Target = Mm;
+
+    #[inline]
+    fn deref(&self) -> &Mm {
+        &self.mm
+    }
+}
+
+// These methods are safe to call even if `mm_users` is zero.
+impl Mm {
+    /// Returns a raw pointer to the inner `mm_struct`.
+    #[inline]
+    pub fn as_raw(&self) -> *mut bindings::mm_struct {
+        self.mm.get()
+    }
+
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
+    /// during the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
+        // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
+        // repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Calls `mmget_not_zero` and returns a handle if it succeeds.
+    #[inline]
+    pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
+
+        if success {
+            // SAFETY: We just created an `mmget` refcount.
+            Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
+        } else {
+            None
+        }
+    }
+}
+
+// These methods require `mm_users` to be non-zero.
+impl MmWithUser {
+    /// Obtain a reference from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
+    /// non-zero for the duration of the lifetime 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
+        // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
+        // to repr(transparent).
+        unsafe { &*ptr.cast() }
+    }
+
+    /// Lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
+        // SAFETY: The pointer is valid since self is a reference.
+        unsafe { bindings::mmap_read_lock(self.as_raw()) };
+
+        // INVARIANT: We just acquired the read lock.
+        MmapReadGuard {
+            mm: self,
+            _nts: NotThreadSafe,
+        }
+    }
+
+    /// Try to lock the mmap read lock.
+    #[inline]
+    pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
+        // SAFETY: The pointer is valid since self is a reference.
+        let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
+
+        if success {
+            // INVARIANT: We just acquired the read lock.
+            Some(MmapReadGuard {
+                mm: self,
+                _nts: NotThreadSafe,
+            })
+        } else {
+            None
+        }
+    }
+}
+
+/// A guard for the mmap read lock.
+///
+/// # Invariants
+///
+/// This `MmapReadGuard` guard owns the mmap read lock.
+pub struct MmapReadGuard<'a> {
+    mm: &'a MmWithUser,
+    // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
+    _nts: NotThreadSafe,
+}
+
+impl Drop for MmapReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
+    }
+}

From patchwork Mon Feb 3 12:14:37 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957365
Date: Mon, 03 Feb 2025 12:14:37 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
X-Mailer: b4 0.13.0
Message-ID: <20250203-vma-v13-2-2b998268a396@google.com>
Subject: [PATCH v13 2/8] mm: rust: add vm_area_struct methods that require
 read access
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

This adds a type called VmAreaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold either the mmap read lock or the vma read lock (or stronger). Additionally, a vma_lookup method is added to the mmap read guard, which enables you to obtain a &VmAreaRef in safe Rust code. This patch only provides a way to lock the mmap read lock, but a follow-up patch also provides a way to just lock the vma read lock. Acked-by: Lorenzo Stoakes Reviewed-by: Jann Horn Reviewed-by: Andreas Hindborg Signed-off-by: Alice Ryhl --- rust/helpers/mm.c | 6 ++ rust/kernel/mm.rs | 21 +++++ rust/kernel/mm/virt.rs | 215 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 242 insertions(+) diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c index 7201747a5d31..7b72eb065a3e 100644 --- a/rust/helpers/mm.c +++ b/rust/helpers/mm.c @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm) { mmap_read_unlock(mm); } + +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm, + unsigned long addr) +{ + return vma_lookup(mm, addr); +} diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs index 2fb5f440af60..bd6ff40f106f 100644 --- a/rust/kernel/mm.rs +++ b/rust/kernel/mm.rs @@ -17,6 +17,8 @@ }; use core::{ops::Deref, ptr::NonNull}; +pub mod virt; + /// A wrapper for the kernel's `struct mm_struct`. /// /// This represents the address space of a userspace process, so each process has one `Mm` @@ -200,6 +202,25 @@ pub struct MmapReadGuard<'a> { _nts: NotThreadSafe, } +impl<'a> MmapReadGuard<'a> { + /// Look up a vma at the given address. + #[inline] + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> { + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay + // for `vma_addr`. + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr) }; + + if vma.is_null() { + None + } else { + // SAFETY: We just checked that a vma was found, so the pointer is valid. 
Furthermore, + // the returned area will borrow from this read lock guard, so it can only be used + // while the mmap read lock is still held. + unsafe { Some(virt::VmAreaRef::from_raw(vma)) } + } + } +} + impl Drop for MmapReadGuard<'_> { #[inline] fn drop(&mut self) { diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs new file mode 100644 index 000000000000..dfe147cafdb3 --- /dev/null +++ b/rust/kernel/mm/virt.rs @@ -0,0 +1,215 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Copyright (C) 2024 Google LLC. + +//! Virtual memory. +//! +//! This module deals with managing a single VMA in the address space of a userspace process. Each +//! VMA corresponds to a region of memory that the userspace process can access, and the VMA lets +//! you control what happens when userspace reads or writes to that region of memory. +//! +//! The module has several different Rust types that all correspond to the C type called +//! `vm_area_struct`. The different structs represent what kind of access you have to the VMA, e.g. +//! [`VmAreaRef`] is used when you hold the mmap or vma read lock. Using the appropriate struct +//! ensures that you can't, for example, accidentally call a function that requires holding the +//! write lock when you only hold the read lock. + +use crate::{bindings, mm::MmWithUser, types::Opaque}; + +/// A wrapper for the kernel's `struct vm_area_struct` with read access. +/// +/// It represents an area of virtual memory. +/// +/// # Invariants +/// +/// The caller must hold the mmap read lock or the vma read lock. +#[repr(transparent)] +pub struct VmAreaRef { + vma: Opaque, +} + +// Methods you can call when holding the mmap or vma read lock (or stronger). They must be usable +// no matter what the vma flags are. +impl VmAreaRef { + /// Access a virtual memory area given a raw pointer. 
+ /// + /// # Safety + /// + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap or vma + /// read lock (or stronger) is held for at least the duration of 'a. + #[inline] + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self { + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a. + unsafe { &*vma.cast() } + } + + /// Returns a raw pointer to this area. + #[inline] + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct { + self.vma.get() + } + + /// Access the underlying `mm_struct`. + #[inline] + pub fn mm(&self) -> &MmWithUser { + // SAFETY: By the type invariants, this `vm_area_struct` is valid and we hold the mmap/vma + // read lock or stronger. This implies that the underlying mm has a non-zero value of + // `mm_users`. + unsafe { MmWithUser::from_raw((*self.as_ptr()).vm_mm) } + } + + /// Returns the flags associated with the virtual memory area. + /// + /// The possible flags are a combination of the constants in [`flags`]. + #[inline] + pub fn flags(&self) -> vm_flags_t { + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this + // access is not a data race. + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags } + } + + /// Returns the (inclusive) start address of the virtual memory area. + #[inline] + pub fn start(&self) -> usize { + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this + // access is not a data race. + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start } + } + + /// Returns the (exclusive) end address of the virtual memory area. + #[inline] + pub fn end(&self) -> usize { + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this + // access is not a data race. + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end } + } + + /// Zap pages in the given page range. 
+    ///
+    /// This clears page table mappings for the range at the leaf level, leaving all other page
+    /// tables intact, and freeing any memory referenced by the VMA in this range. That is,
+    /// anonymous memory is completely freed, file-backed memory has its reference count on page
+    /// cache folio's dropped, any dirty data will still be written back to disk as usual.
+    ///
+    /// It may seem odd that we clear at the leaf level, this is however a product of the page
+    /// table structure used to map physical memory into a virtual address space - each virtual
+    /// address actually consists of a bitmap of array indices into page tables, which form a
+    /// hierarchical page table level structure.
+    ///
+    /// As a result, each page table level maps a multiple of page table levels below, and thus
+    /// span ever increasing ranges of pages. At the leaf or PTE level, we map the actual physical
+    /// memory.
+    ///
+    /// It is here where a zap operates, as it the only place we can be certain of clearing without
+    /// impacting any other virtual mappings. It is an implementation detail as to whether the
+    /// kernel goes further in freeing unused page tables, but for the purposes of this operation
+    /// we must only assume that the leaf level is cleared.
+    #[inline]
+    pub fn zap_page_range_single(&self, address: usize, size: usize) {
+        let (end, did_overflow) = address.overflowing_add(size);
+        if did_overflow || address < self.start() || self.end() < end {
+            // TODO: call WARN_ONCE once Rust version of it is added
+            return;
+        }
+
+        // SAFETY: By the type invariants, the caller has read access to this VMA, which is
+        // sufficient for this method call. This method has no requirements on the vma flags. The
+        // address range is checked to be within the vma.
+        unsafe {
+            bindings::zap_page_range_single(
+                self.as_ptr(),
+                address,
+                size,
+                core::ptr::null_mut(),
+            )
+        };
+    }
+}
+
+/// The integer type used for vma flags.
+#[doc(inline)]
+pub use bindings::vm_flags_t;
+
+/// All possible flags for [`VmAreaRef`].
+pub mod flags {
+    use super::vm_flags_t;
+    use crate::bindings;
+
+    /// No flags are set.
+    pub const NONE: vm_flags_t = bindings::VM_NONE as _;
+
+    /// Mapping allows reads.
+    pub const READ: vm_flags_t = bindings::VM_READ as _;
+
+    /// Mapping allows writes.
+    pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
+
+    /// Mapping allows execution.
+    pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
+
+    /// Mapping is shared.
+    pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
+
+    /// Mapping may be updated to allow reads.
+    pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
+
+    /// Mapping may be updated to allow writes.
+    pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
+
+    /// Mapping may be updated to allow execution.
+    pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
+
+    /// Mapping may be updated to be shared.
+    pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
+
+    /// Page-ranges managed without `struct page`, just pure PFN.
+    pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
+
+    /// Memory mapped I/O or similar.
+    pub const IO: vm_flags_t = bindings::VM_IO as _;
+
+    /// Do not copy this vma on fork.
+    pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
+
+    /// Cannot expand with mremap().
+    pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
+
+    /// Lock the pages covered when they are faulted in.
+    pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
+
+    /// Is a VM accounted object.
+    pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
+
+    /// Should the VM suppress accounting.
+    pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
+
+    /// Huge TLB Page VM.
+    pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
+
+    /// Synchronous page faults. (DAX-specific)
+    pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
+
+    /// Architecture-specific flag.
+    pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
+
+    /// Wipe VMA contents in child on fork.
+    pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
+
+    /// Do not include in the core dump.
+    pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
+
+    /// Not soft dirty clean area.
+    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
+
+    /// Can contain `struct page` and pure PFN pages.
+    pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
+
+    /// MADV_HUGEPAGE marked this vma.
+    pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
+
+    /// MADV_NOHUGEPAGE marked this vma.
+    pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
+
+    /// KSM may merge identical pages.
+    pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
+}

From patchwork Mon Feb 3 12:14:38 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957350
Date: Mon, 03 Feb 2025 12:14:38 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
X-Mailer: b4 0.13.0
Message-ID: <20250203-vma-v13-3-2b998268a396@google.com>
Subject: [PATCH v13 3/8] mm: rust: add vm_insert_page
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

The vm_insert_page method is only usable on vmas with the VM_MIXEDMAP
flag, so we introduce a new type to keep track of such vmas.

The approach used in this patch assumes that we will not need to encode
many flag combinations in the type. I don't think we need to encode more
than VM_MIXEDMAP and VM_PFNMAP as things are now. However, if that
becomes necessary, using generic parameters in a single type would scale
better as the number of flags increases.

Acked-by: Lorenzo Stoakes
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/mm/virt.rs | 79 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index dfe147cafdb3..64a0a47070a8 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -14,7 +14,15 @@
 //! ensures that you can't, for example, accidentally call a function that requires holding the
 //! write lock when you only hold the read lock.
 
-use crate::{bindings, mm::MmWithUser, types::Opaque};
+use crate::{
+    bindings,
+    error::{to_result, Result},
+    mm::MmWithUser,
+    page::Page,
+    types::Opaque,
+};
+
+use core::ops::Deref;
 
 /// A wrapper for the kernel's `struct vm_area_struct` with read access.
 ///
@@ -124,6 +132,75 @@ pub fn zap_page_range_single(&self, address: usize, size: usize) {
             )
         };
     }
+
+    /// If the [`VM_MIXEDMAP`] flag is set, returns a [`VmAreaMixedMap`] to this VMA, otherwise
+    /// returns `None`.
+    ///
+    /// This can be used to access methods that require [`VM_MIXEDMAP`] to be set.
+    ///
+    /// [`VM_MIXEDMAP`]: flags::MIXEDMAP
+    #[inline]
+    pub fn as_mixedmap_vma(&self) -> Option<&VmAreaMixedMap> {
+        if self.flags() & flags::MIXEDMAP != 0 {
+            // SAFETY: We just checked that `VM_MIXEDMAP` is set. All other requirements are
+            // satisfied by the type invariants of `VmAreaRef`.
+            Some(unsafe { VmAreaMixedMap::from_raw(self.as_ptr()) })
+        } else {
+            None
+        }
+    }
+}
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access and [`VM_MIXEDMAP`] set.
+///
+/// It represents an area of virtual memory.
+///
+/// This struct is identical to [`VmAreaRef`] except that it must only be used when the
+/// [`VM_MIXEDMAP`] flag is set on the vma.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock. The `VM_MIXEDMAP` flag must be
+/// set.
+///
+/// [`VM_MIXEDMAP`]: flags::MIXEDMAP
+#[repr(transparent)]
+pub struct VmAreaMixedMap {
+    vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaMixedMap`.
+impl Deref for VmAreaMixedMap {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        &self.vma
+    }
+}
+
+impl VmAreaMixedMap {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
+    /// (or stronger) is held for at least the duration of 'a. The `VM_MIXEDMAP` flag must be set.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Maps a single page at the given address within the virtual memory area.
+    ///
+    /// This operation does not take ownership of the page.
+    #[inline]
+    pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
+        // SAFETY: By the type invariant of `Self` caller has read access and has verified that
+        // `VM_MIXEDMAP` is set. By invariant on `Page` the page has order 0.
+        to_result(unsafe { bindings::vm_insert_page(self.as_ptr(), address, page.as_ptr()) })
+    }
 }
 
 /// The integer type used for vma flags.

From patchwork Mon Feb 3 12:14:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957355
Date: Mon, 03 Feb 2025 12:14:39 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
X-Mailer: b4 0.13.0
Message-ID: <20250203-vma-v13-4-2b998268a396@google.com>
Subject: [PATCH v13 4/8] mm: rust: add lock_vma_under_rcu
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

Currently, the binder driver always uses the mmap lock to make changes to
its vma. Because the mmap lock is global to the process, this can involve
significant contention. However, the kernel has a feature called per-vma
locks, which can significantly reduce contention. For example, you can
take a vma lock in parallel with an mmap write lock. This is important
because contention on the mmap lock has been a long-term recurring
challenge for the Binder driver.

This patch introduces support for using `lock_vma_under_rcu` from Rust.
The Rust Binder driver will be able to use this to reduce contention on
the mmap lock.
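
For readers unfamiliar with the pattern, the following is a minimal
user-space sketch of "try the per-vma lock first, fall back to the mmap
lock". It is not kernel code and is not part of this patch: `Space`,
`Area`, `Path`, `lock_area_fast`, `read_area` and `write_area` are
hypothetical names, and `std::sync::RwLock` merely stands in for the
kernel's locks (there is no RCU here). The one property it tries to model
is that writers take the process-wide lock before an area's own lock, so a
reader that loses the trylock can wait on the process-wide lock instead.

```rust
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical stand-in for a vma: an address range plus its own lock.
struct Area {
    start: usize,      // inclusive
    end: usize,        // exclusive
    lock: RwLock<u32>, // models the per-vma lock; the payload is arbitrary
}

/// Hypothetical stand-in for an mm: one process-wide lock plus its areas.
struct Space {
    mmap_lock: RwLock<()>,
    areas: Vec<Area>,
}

/// Which path served a read.
#[derive(Debug, PartialEq)]
enum Path {
    PerArea,
    MmapLock,
}

impl Space {
    fn find(&self, addr: usize) -> Option<&Area> {
        self.areas.iter().find(|a| a.start <= addr && addr < a.end)
    }

    /// Optimistic trylock: never blocks, returns `None` on contention or
    /// when no area contains `addr`.
    fn lock_area_fast(&self, addr: usize) -> Option<RwLockReadGuard<'_, u32>> {
        self.find(addr)?.lock.try_read().ok()
    }

    /// Try the per-area lock first; fall back to the process-wide lock.
    fn read_area(&self, addr: usize) -> Option<(Path, u32)> {
        if let Some(guard) = self.lock_area_fast(addr) {
            return Some((Path::PerArea, *guard));
        }
        // Slow path: writers in this model hold `mmap_lock` for writing,
        // so holding it for reading keeps them out while we read the area.
        let _mmap = self.mmap_lock.read().unwrap();
        let area = self.find(addr)?;
        let guard = area.lock.read().unwrap();
        Some((Path::MmapLock, *guard))
    }

    /// Writers take the process-wide lock first, then the area's lock.
    fn write_area(&self, addr: usize, value: u32) -> bool {
        let _mmap = self.mmap_lock.write().unwrap();
        match self.find(addr) {
            Some(area) => {
                *area.lock.write().unwrap() = value;
                true
            }
            None => false,
        }
    }
}
```

In this model readers only take the process-wide lock when the fast path
returns `None`, which is the behaviour the fallback advice in the
`lock_vma_under_rcu` documentation describes.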

Acked-by: Lorenzo Stoakes
Reviewed-by: Jann Horn
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/helpers/mm.c |  5 +++++
 rust/kernel/mm.rs | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7b72eb065a3e..81b510c96fd2 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -43,3 +43,8 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
 {
 	return vma_lookup(mm, addr);
 }
+
+void rust_helper_vma_end_read(struct vm_area_struct *vma)
+{
+	vma_end_read(vma);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index bd6ff40f106f..88167c0cbd93 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -18,6 +18,7 @@
 use core::{ops::Deref, ptr::NonNull};
 
 pub mod virt;
+use virt::VmAreaRef;
 
 /// A wrapper for the kernel's `struct mm_struct`.
 ///
@@ -160,6 +161,36 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
         unsafe { &*ptr.cast() }
     }
 
+    /// Attempt to access a vma using the vma read lock.
+    ///
+    /// This is an optimistic trylock operation, so it may fail if there is contention. In that
+    /// case, you should fall back to taking the mmap read lock.
+    ///
+    /// When per-vma locks are disabled, this always returns `None`.
+    #[inline]
+    pub fn lock_vma_under_rcu(&self, vma_addr: usize) -> Option<VmaReadGuard<'_>> {
+        #[cfg(CONFIG_PER_VMA_LOCK)]
+        {
+            // SAFETY: Calling `bindings::lock_vma_under_rcu` is always okay given an mm where
+            // `mm_users` is non-zero.
+            let vma = unsafe { bindings::lock_vma_under_rcu(self.as_raw(), vma_addr) };
+            if !vma.is_null() {
+                return Some(VmaReadGuard {
+                    // SAFETY: If `lock_vma_under_rcu` returns a non-null ptr, then it points at a
+                    // valid vma. The vma is stable for as long as the vma read lock is held.
+                    vma: unsafe { VmAreaRef::from_raw(vma) },
+                    _nts: NotThreadSafe,
+                });
+            }
+        }
+
+        // Silence warnings about unused variables.
+        #[cfg(not(CONFIG_PER_VMA_LOCK))]
+        let _ = vma_addr;
+
+        None
+    }
+
     /// Lock the mmap read lock.
     #[inline]
     pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
@@ -228,3 +259,32 @@ fn drop(&mut self) {
         unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
     }
 }
+
+/// A guard for the vma read lock.
+///
+/// # Invariants
+///
+/// This `VmaReadGuard` guard owns the vma read lock.
+pub struct VmaReadGuard<'a> {
+    vma: &'a VmAreaRef,
+    // `vma_end_read` must be called on the same thread as where the lock was taken
+    _nts: NotThreadSafe,
+}
+
+// Make all `VmAreaRef` methods available on `VmaReadGuard`.
+impl Deref for VmaReadGuard<'_> {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        self.vma
+    }
+}
+
+impl Drop for VmaReadGuard<'_> {
+    #[inline]
+    fn drop(&mut self) {
+        // SAFETY: We hold the read lock by the type invariants.
+        unsafe { bindings::vma_end_read(self.vma.as_ptr()) };
+    }
+}

From patchwork Mon Feb 3 12:14:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957354
Date: Mon, 03 Feb 2025 12:14:40 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
X-Mailer: b4 0.13.0
Message-ID: <20250203-vma-v13-5-2b998268a396@google.com>
Subject: [PATCH v13 5/8] mm: rust: add mmput_async support
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka, John Hubbard, "Liam R.
Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspamd-Queue-Id: 6F58920003 X-Stat-Signature: 6ge1jaqje1szrceawrpfgrrjjzhhbn9t X-Rspamd-Server: rspam08 X-Rspam-User: X-HE-Tag: 1738584924-186912 X-HE-Meta: U2FsdGVkX1+wCH+j8jCEin/78txXPZnqfVdeMbAlXwlI1QCXx6uTewCTlBZjgtt+CFrZGa3are/5rN2FebBCxb3KyNhD1TU3PweJfTb5xiabNggJQbOkCEx2SPlSONkCwN5Z68bDpieKA7zlnbADrn2a3N3cjwuNA4aWhX2oyCs6u+S9duPYoB7EcJ8CK9OVPH7zUK04MHhhw3dFqvu42OyY9T9lOrMpm/saiPFw34bTHU7ip6tjraZh/IP4rhKOKxys4hyxNFKYcbnQKBEJeutqMWQGJ5Nj2Fi/jL3StnwiTAgiv+o5LS7UoGINneRsF8d8UjsH0IV27J9xLn8qnOTr9pQoEWR3qLJNQe07dJb2JGI3quli8byshpIMzydcZMGrjrAfLMRWsDEFifL3mpD1BaS3Q2a5zX2OVFpTSEOPUAdRdMH33u4spyDdlny0Yt2sDWjiRMdcqCEnVj9x56AiyRpMUslDJ7PB3AqK6rlKMbIPfnELp/BBgDFlxZUN1uTLEVX4bhQ+mEKfgxq6mtetRGEtascHUtPnAvJj+ylpmmO4P36fSOchW2Glx7fZW47IlfSENTpjtAfuMso5dhPjnE+MKvQ2hDMNs3PbjI8ObpoIIFtDeHd+4BQJca/caILeBwAKY5LEuux6O/HqowBTzgPbLTRv/1bkq1XA9QTrkgiwl1lrPlvuyegsO5upwwJPNGuWJ3psoRA0lWKUc8MMh26k3dzKLiA79Wa72mez5l+dPuhaMWhQc2m1JDM0Y/GnVfJRalsgSXOl3q+tYQNlaqYdZnzDl+jns0gFaPuX4WJKwk9VLKUg/JX15NeHgqYEDgSBcauLrZUCV2Qt6BWKN0yse7+N4mDGv7UWM0D4wzLybUzCr3Z38MLh9EbEwRBRB1zaKkFwGpzrxFJYvgJGWA7IBDu6tDq7dEqVKgGAuGZESCiIgTTbGg5SOrtJ40cowv+AjygdneTxg7e eWRcRDdO 
Ll5nlRXXpYpCY/rwg8P5zBrzmOn5cLKxD+iG0LcrJY0sMrW+rRspaQA19W0HVVyNEUTCFrA45EVTno1ai+QDEFdSLCbKueO1MRVAJTKsc+WK1ZHKUwJxTq9O6MrIUwTkR94dzSeJWtW6GA7FZvTvfBAdbwJEdpAbc9Z7LRte+Nkv9mu1TzQTkeUuviKNETUwhI6MdSxFhEXXQgY7LFS9d05+mn8YfzBrfLS1gbfpkG+MJdcxUIdzQ+gTN+/nSbaTBewDsjQxq3iqR9KmnCZm1BNi8Ns9+Bvd3ge7nt0Dqe0hSWmj3cFoiTZCg/ShFt3QkrM2t6hSW4didPmBVd49Eh47SmvTXxaGlBbuGwMJBA0Tql/sFiIV42/jX/N57oka9aV1I50YPjXTtUOFdlT2DtKAlNCFri9Tu1t4sMrRLW0KI0TUQlS3TFtnETBS6XmEt7s/bnRMAPmwwr6MOqP46Yk1vate91M4LjgrF2sv0jo4785Q+spl4C2idQiLrDHjEcxNlDCfbad//BdBOERniYJIO+Apy9A2wULd9NZV5PgCRBYn4cUStWioQDhbyeFOOY8xpFZgoCfSDTeGnmBxf1wCi1KFfHRqyZygFff+qYr7zSL7JFJ8b9h4uf8JZza+LyFjNkwXq4hEmHBmvoTqV8mEoYXa8Wud1KHXzWmDy3OZ57MZd8z19gS7hMItLoBw5G/Rl4Oz6/F/UaHvcj9W5p84Kb2hHlTloDoT1 X-Bogosity: Ham, tests=bogofilter, spamicity=0.440840, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Adds an MmWithUserAsync type that uses mmput_async when dropped but is otherwise identical to MmWithUser. This has to be done using a separate type because the thing we are changing is the destructor. Rust Binder needs this to avoid a certain deadlock. See commit 9a9ab0d96362 ("binder: fix race between mmput() and do_exit()") for details. It's also needed in the shrinker to avoid cleaning up the mm in the shrinker's context. Reviewed-by: Andreas Hindborg Acked-by: Lorenzo Stoakes (for mm bits) Signed-off-by: Alice Ryhl --- rust/kernel/mm.rs | 49 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 49 insertions(+) diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs index 88167c0cbd93..1c43e51c9cca 100644 --- a/rust/kernel/mm.rs +++ b/rust/kernel/mm.rs @@ -110,6 +110,48 @@ fn deref(&self) -> &Mm { } } +/// A wrapper for the kernel's `struct mm_struct`. +/// +/// This type is identical to `MmWithUser` except that it uses `mmput_async` when dropping a +/// refcount. 
This means that the destructor of `ARef` is safe to call in atomic +/// context. +/// +/// # Invariants +/// +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero. +#[repr(transparent)] +pub struct MmWithUserAsync { + mm: MmWithUser, +} + +// SAFETY: It is safe to call `mmput_async` on another thread than where `mmget` was called. +unsafe impl Send for MmWithUserAsync {} +// SAFETY: All methods on `MmWithUserAsync` can be called in parallel from several threads. +unsafe impl Sync for MmWithUserAsync {} + +// SAFETY: By the type invariants, this type is always refcounted. +unsafe impl AlwaysRefCounted for MmWithUserAsync { + fn inc_ref(&self) { + // SAFETY: The pointer is valid since self is a reference. + unsafe { bindings::mmget(self.as_raw()) }; + } + + unsafe fn dec_ref(obj: NonNull) { + // SAFETY: The caller is giving up their refcount. + unsafe { bindings::mmput_async(obj.cast().as_ptr()) }; + } +} + +// Make all `MmWithUser` methods available on `MmWithUserAsync`. +impl Deref for MmWithUserAsync { + type Target = MmWithUser; + + #[inline] + fn deref(&self) -> &MmWithUser { + &self.mm + } +} + // These methods are safe to call even if `mm_users` is zero. impl Mm { /// Returns a raw pointer to the inner `mm_struct`. @@ -161,6 +203,13 @@ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser { unsafe { &*ptr.cast() } } + /// Use `mmput_async` when dropping this refcount. + #[inline] + pub fn into_mmput_async(me: ARef) -> ARef { + // SAFETY: The layouts and invariants are compatible. + unsafe { ARef::from_raw(ARef::into_raw(me).cast()) } + } + /// Attempt to access a vma using the vma read lock. /// /// This is an optimistic trylock operation, so it may fail if there is contention. 
In that From patchwork Mon Feb 3 12:14:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 13957384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05735C02194 for ; Mon, 3 Feb 2025 12:34:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4CFD1280002; Mon, 3 Feb 2025 07:34:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 47FE7280001; Mon, 3 Feb 2025 07:34:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 32074280002; Mon, 3 Feb 2025 07:34:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 105D9280001 for ; Mon, 3 Feb 2025 07:34:51 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 99CF41226F8 for ; Mon, 3 Feb 2025 12:15:28 +0000 (UTC) X-FDA: 83078528736.15.29371CB Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) by imf23.hostedemail.com (Postfix) with ESMTP id 7C243140011 for ; Mon, 3 Feb 2025 12:15:26 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Sd5hfasb; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf23.hostedemail.com: domain of 3XbOgZwkKCIoozwqs5Cvzu22uzs.q20zw18B-00y9oqy.25u@flex--aliceryhl.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=3XbOgZwkKCIoozwqs5Cvzu22uzs.q20zw18B-00y9oqy.25u@flex--aliceryhl.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738584926; a=rsa-sha256; cv=none; 
Date: Mon, 03 Feb 2025 12:14:41 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
Message-ID: <20250203-vma-v13-6-2b998268a396@google.com>
Subject: [PATCH v13 6/8] mm: rust: add VmAreaNew for f_ops->mmap()
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

This type will be used when setting up a new vma in an f_ops->mmap()
hook. Using a separate type from VmAreaRef allows us to have a separate
set of operations that you are only able to use during the mmap() hook.
For example, the VM_MIXEDMAP flag must not be changed after the initial
setup that happens during the f_ops->mmap() hook.

To avoid setting invalid flag values, the methods for clearing
VM_MAYWRITE and similar involve a check of VM_WRITE, and return an error
if VM_WRITE is set. Trying to use `try_clear_maywrite` without checking
the return value results in a compilation error because the `Result`
type is marked #[must_use].

For now, there's only a method for VM_MIXEDMAP and not VM_PFNMAP. When
we add a VM_PFNMAP method, we will need some way to prevent you from
setting both VM_MIXEDMAP and VM_PFNMAP on the same vma.

Acked-by: Lorenzo Stoakes
Reviewed-by: Jann Horn
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/mm/virt.rs | 186 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 185 insertions(+), 1 deletion(-)

diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
index 64a0a47070a8..5847fe73db17 100644
--- a/rust/kernel/mm/virt.rs
+++ b/rust/kernel/mm/virt.rs
@@ -16,7 +16,7 @@
 
 use crate::{
     bindings,
-    error::{to_result, Result},
+    error::{code::EINVAL, to_result, Result},
     mm::MmWithUser,
     page::Page,
     types::Opaque,
@@ -203,6 +203,190 @@ pub fn vm_insert_page(&self, address: usize, page: &Page) -> Result {
     }
 }
 
+/// A configuration object for setting up a VMA in an `f_ops->mmap()` hook.
+///
+/// The `f_ops->mmap()` hook is called when a new VMA is being created, and the hook is able to
+/// configure the VMA in various ways to fit the driver that owns it. Using `VmAreaNew` indicates
+/// that you are allowed to perform operations on the VMA that can only be performed before the VMA
+/// is fully initialized.
+///
+/// # Invariants
+///
+/// For the duration of 'a, the referenced vma must be undergoing initialization in an
+/// `f_ops->mmap()` hook.
+pub struct VmAreaNew {
+    vma: VmAreaRef,
+}
+
+// Make all `VmAreaRef` methods available on `VmAreaNew`.
+impl Deref for VmAreaNew {
+    type Target = VmAreaRef;
+
+    #[inline]
+    fn deref(&self) -> &VmAreaRef {
+        &self.vma
+    }
+}
+
+impl VmAreaNew {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is undergoing initial vma setup for the duration of 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *mut bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Internal method for updating the vma flags.
+    ///
+    /// # Safety
+    ///
+    /// This must not be used to set the flags to an invalid value.
+    #[inline]
+    unsafe fn update_flags(&self, set: vm_flags_t, unset: vm_flags_t) {
+        let mut flags = self.flags();
+        flags |= set;
+        flags &= !unset;
+
+        // SAFETY: This is not a data race: the vma is undergoing initial setup, so it's not yet
+        // shared. Additionally, `VmAreaNew` is `!Sync`, so it cannot be used to write in parallel.
+        // The caller promises that this does not set the flags to an invalid value.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.__vm_flags = flags };
+    }
+
+    /// Set the `VM_MIXEDMAP` flag on this vma.
+    ///
+    /// This enables the vma to contain both `struct page` and pure PFN pages. Returns a reference
+    /// that can be used to call `vm_insert_page` on the vma.
+    #[inline]
+    pub fn set_mixedmap(&self) -> &VmAreaMixedMap {
+        // SAFETY: We don't yet provide a way to set VM_PFNMAP, so this cannot put the flags in an
+        // invalid state.
+        unsafe { self.update_flags(flags::MIXEDMAP, 0) };
+
+        // SAFETY: We just set `VM_MIXEDMAP` on the vma.
+        unsafe { VmAreaMixedMap::from_raw(self.vma.as_ptr()) }
+    }
+
+    /// Set the `VM_IO` flag on this vma.
+    ///
+    /// This is used for memory mapped IO and similar. The flag tells other parts of the kernel to
+    /// avoid looking at the pages. For memory mapped IO this is useful as accesses to the pages
+    /// could have side effects.
+    #[inline]
+    pub fn set_io(&self) {
+        // SAFETY: Setting the VM_IO flag is always okay.
+        unsafe { self.update_flags(flags::IO, 0) };
+    }
+
+    /// Set the `VM_DONTEXPAND` flag on this vma.
+    ///
+    /// This prevents the vma from being expanded with `mremap()`.
+    #[inline]
+    pub fn set_dontexpand(&self) {
+        // SAFETY: Setting the VM_DONTEXPAND flag is always okay.
+        unsafe { self.update_flags(flags::DONTEXPAND, 0) };
+    }
+
+    /// Set the `VM_DONTCOPY` flag on this vma.
+    ///
+    /// This prevents the vma from being copied on fork. This option is only permanent if `VM_IO`
+    /// is set.
+    #[inline]
+    pub fn set_dontcopy(&self) {
+        // SAFETY: Setting the VM_DONTCOPY flag is always okay.
+        unsafe { self.update_flags(flags::DONTCOPY, 0) };
+    }
+
+    /// Set the `VM_DONTDUMP` flag on this vma.
+    ///
+    /// This prevents the vma from being included in core dumps. This option is only permanent if
+    /// `VM_IO` is set.
+    #[inline]
+    pub fn set_dontdump(&self) {
+        // SAFETY: Setting the VM_DONTDUMP flag is always okay.
+        unsafe { self.update_flags(flags::DONTDUMP, 0) };
+    }
+
+    /// Returns whether `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as readable.
+    #[inline]
+    pub fn readable(&self) -> bool {
+        (self.flags() & flags::READ) != 0
+    }
+
+    /// Try to clear the `VM_MAYREAD` flag, failing if `VM_READ` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma readable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYREAD` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayread(&self) -> Result {
+        if self.readable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYREAD` is okay when `VM_READ` is not set.
+        unsafe { self.update_flags(0, flags::MAYREAD) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as writable.
+    #[inline]
+    pub fn writable(&self) -> bool {
+        (self.flags() & flags::WRITE) != 0
+    }
+
+    /// Try to clear the `VM_MAYWRITE` flag, failing if `VM_WRITE` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma writable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYWRITE` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_maywrite(&self) -> Result {
+        if self.writable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYWRITE` is okay when `VM_WRITE` is not set.
+        unsafe { self.update_flags(0, flags::MAYWRITE) };
+        Ok(())
+    }
+
+    /// Returns whether `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is mapping this vma as executable.
+    #[inline]
+    pub fn executable(&self) -> bool {
+        (self.flags() & flags::EXEC) != 0
+    }
+
+    /// Try to clear the `VM_MAYEXEC` flag, failing if `VM_EXEC` is set.
+    ///
+    /// This flag indicates whether userspace is allowed to make this vma executable with
+    /// `mprotect()`.
+    ///
+    /// Note that this operation is irreversible. Once `VM_MAYEXEC` has been cleared, it can never
+    /// be set again.
+    #[inline]
+    pub fn try_clear_mayexec(&self) -> Result {
+        if self.executable() {
+            return Err(EINVAL);
+        }
+        // SAFETY: Clearing `VM_MAYEXEC` is okay when `VM_EXEC` is not set.
+        unsafe { self.update_flags(0, flags::MAYEXEC) };
+        Ok(())
+    }
+}
+
 /// The integer type used for vma flags.
 #[doc(inline)]
 pub use bindings::vm_flags_t;

From patchwork Mon Feb 3 12:14:42 2025
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 13957364
Date: Mon, 03 Feb 2025 12:14:42 +0000
In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com>
References: <20250203-vma-v13-0-2b998268a396@google.com>
Message-ID: <20250203-vma-v13-7-2b998268a396@google.com>
Subject: [PATCH v13 7/8] rust: miscdevice: add mmap support
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R. Howlett", Andrew Morton, Greg Kroah-Hartman,
 Arnd Bergmann, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl

Add the ability to write a file_operations->mmap
hook in Rust when using the miscdevice abstraction. The `vma` argument
to the `mmap` hook uses the `VmAreaNew` type from the previous commit;
this type provides the correct set of operations for a
file_operations->mmap hook.

Acked-by: Greg Kroah-Hartman
Acked-by: Lorenzo Stoakes
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/miscdevice.rs | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/rust/kernel/miscdevice.rs b/rust/kernel/miscdevice.rs
index e14433b2ab9d..97b2384f0816 100644
--- a/rust/kernel/miscdevice.rs
+++ b/rust/kernel/miscdevice.rs
@@ -14,6 +14,7 @@
     error::{to_result, Error, Result, VTABLE_DEFAULT_ERROR},
     ffi::{c_int, c_long, c_uint, c_ulong},
     fs::File,
+    mm::virt::VmAreaNew,
     prelude::*,
     seq_file::SeqFile,
     str::CStr,
@@ -119,6 +120,22 @@ fn release(device: Self::Ptr, _file: &File) {
         drop(device);
     }
 
+    /// Handler for mmap.
+    ///
+    /// This function is invoked when a user space process invokes the `mmap` system call on
+    /// `file`. The function is a callback that is part of the VMA initializer. The kernel will do
+    /// initial setup of the VMA before calling this function. The function can then interact with
+    /// the VMA initialization by calling methods of `vma`. If the function does not return an
+    /// error, the kernel will complete initialization of the VMA according to the properties of
+    /// `vma`.
+    fn mmap(
+        _device: <Self::Ptr as ForeignOwnable>::Borrowed<'_>,
+        _file: &File,
+        _vma: &VmAreaNew,
+    ) -> Result {
+        kernel::build_error!(VTABLE_DEFAULT_ERROR)
+    }
+
     /// Handler for ioctls.
     ///
     /// The `cmd` argument is usually manipulated using the utilties in [`kernel::ioctl`].
@@ -176,6 +193,7 @@ impl<T: MiscDevice> VtableHelper<T> {
         const VTABLE: bindings::file_operations = bindings::file_operations {
             open: Some(fops_open::<T>),
             release: Some(fops_release::<T>),
+            mmap: maybe_fn(T::HAS_MMAP, fops_mmap::<T>),
             unlocked_ioctl: maybe_fn(T::HAS_IOCTL, fops_ioctl::<T>),
             #[cfg(CONFIG_COMPAT)]
             compat_ioctl: if T::HAS_COMPAT_IOCTL {
@@ -257,6 +275,32 @@ impl<T: MiscDevice> VtableHelper<T> {
     0
 }
 
+/// # Safety
+///
+/// `file` must be a valid file that is associated with a `MiscDeviceRegistration`.
+/// `vma` must be a vma that is currently being mmap'ed with this file.
+unsafe extern "C" fn fops_mmap<T: MiscDevice>(
+    file: *mut bindings::file,
+    vma: *mut bindings::vm_area_struct,
+) -> c_int {
+    // SAFETY: The mmap call of a file can access the private data.
+    let private = unsafe { (*file).private_data };
+    // SAFETY: This is a Rust Miscdevice, so we call `into_foreign` in `open` and `from_foreign` in
+    // `release`, and `fops_mmap` is guaranteed to be called between those two operations.
+    let device = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) };
+    // SAFETY: The caller provides a vma that is undergoing initial VMA setup.
+    let area = unsafe { VmAreaNew::from_raw(vma) };
+    // SAFETY:
+    // * The file is valid for the duration of this call.
+    // * There is no active fdget_pos region on the file on this thread.
+    let file = unsafe { File::from_raw_file(file) };
+
+    match T::mmap(device, file, area) {
+        Ok(()) => 0,
+        Err(err) => err.to_errno(),
+    }
+}
+
 /// # Safety
 ///
 /// `file` must be a valid file that is associated with a `MiscDeviceRegistration`.
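
For readers who want to experiment with the dispatch pattern outside the kernel, the following is a minimal userspace sketch of how an optional `mmap` hook might be wired into a vtable and how its `Result` is translated into a C-style return value. It is not kernel code: `Errno`, `FileOps`, `MmapFn`, `vtable`, `OnePage` and `NoMmap` are hypothetical names, `VmAreaNew` and `MiscDevice` here are simplified stand-ins for the kernel types of the same name, `HAS_MMAP` is set by hand (the kernel generates it via `#[vtable]`), and the default hook returns a runtime error where the kernel uses `build_error!`.

```rust
// Userspace model only; none of these types are the kernel's own.
use std::os::raw::c_int;

/// Hypothetical error type holding a positive errno value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Errno(pub c_int);

impl Errno {
    pub const ENODEV: Errno = Errno(19);
    pub const EINVAL: Errno = Errno(22);

    /// C convention: failures are reported as negative errno values.
    pub fn to_errno(self) -> c_int {
        -self.0
    }
}

/// Simplified stand-in for a VMA that is still being set up.
pub struct VmAreaNew {
    pub start: usize,
    pub end: usize,
}

pub trait MiscDevice {
    /// Set by hand in this model; assumed to be generated in the kernel.
    const HAS_MMAP: bool = false;

    /// Default hook. A runtime error stands in for the build-time error.
    fn mmap(&self, _vma: &VmAreaNew) -> Result<(), Errno> {
        Err(Errno::ENODEV)
    }
}

/// Function-pointer type stored in the modelled vtable.
pub type MmapFn<T> = fn(&T, &VmAreaNew) -> c_int;

/// Modelled `file_operations` with a single optional entry.
pub struct FileOps<T> {
    pub mmap: Option<MmapFn<T>>,
}

/// Install `func` only if the implementer declared the hook.
pub fn maybe_fn<F: Copy>(check: bool, func: F) -> Option<F> {
    if check {
        Some(func)
    } else {
        None
    }
}

/// Trampoline: call the trait method and map `Result` to 0 or -errno.
pub fn fops_mmap<T: MiscDevice>(dev: &T, vma: &VmAreaNew) -> c_int {
    match dev.mmap(vma) {
        Ok(()) => 0,
        Err(err) => err.to_errno(),
    }
}

/// Build the modelled vtable for a device type.
pub fn vtable<T: MiscDevice>() -> FileOps<T> {
    FileOps {
        mmap: maybe_fn(T::HAS_MMAP, fops_mmap::<T> as MmapFn<T>),
    }
}

/// Example device that accepts only mappings of exactly 4096 bytes.
pub struct OnePage;

impl MiscDevice for OnePage {
    const HAS_MMAP: bool = true;

    fn mmap(&self, vma: &VmAreaNew) -> Result<(), Errno> {
        if vma.end.checked_sub(vma.start) != Some(4096) {
            return Err(Errno::EINVAL);
        }
        Ok(())
    }
}

/// Example device that leaves the hook unimplemented.
pub struct NoMmap;

impl MiscDevice for NoMmap {}
```

In this model, `vtable::<OnePage>().mmap` holds the trampoline, while `vtable::<NoMmap>().mmap` is `None`, which mirrors the intent of leaving the C function pointer unset when a driver does not implement the hook.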
From patchwork Mon Feb 3 12:14:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 13957353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0EB26C02192 for ; Mon, 3 Feb 2025 12:15:39 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 97CFB280011; Mon, 3 Feb 2025 07:15:38 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 92D84280010; Mon, 3 Feb 2025 07:15:38 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7A717280011; Mon, 3 Feb 2025 07:15:38 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 547C7280010 for ; Mon, 3 Feb 2025 07:15:38 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 1217D1CA236 for ; Mon, 3 Feb 2025 12:15:33 +0000 (UTC) X-FDA: 83078528946.17.BC9A1D2 Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) by imf04.hostedemail.com (Postfix) with ESMTP id F3F784000B for ; Mon, 3 Feb 2025 12:15:30 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=XpqTDFUY; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf04.hostedemail.com: domain of 3YbOgZwkKCI4s30uw9Gz3y66y3w.u64305CF-442Dsu2.69y@flex--aliceryhl.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=3YbOgZwkKCI4s30uw9Gz3y66y3w.u64305CF-442Dsu2.69y@flex--aliceryhl.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738584931; a=rsa-sha256; cv=none; 
b=KAgKZPb4DiY9XfRA1w1AcqyaUjprZs9Gilujeb0sm+BNL1jioPP1QYgzR9pDX8VMpfJdMR V0nGVApcLrsz05ql7m343jo03CUKV7CzR20cZ6urpMFWPm69QHw/US+CE2ZBvtV5sKwBqH B14Hq7YxxbOx/UaXKPg+LMRJHyJnEIw= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=XpqTDFUY; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf04.hostedemail.com: domain of 3YbOgZwkKCI4s30uw9Gz3y66y3w.u64305CF-442Dsu2.69y@flex--aliceryhl.bounces.google.com designates 209.85.128.74 as permitted sender) smtp.mailfrom=3YbOgZwkKCI4s30uw9Gz3y66y3w.u64305CF-442Dsu2.69y@flex--aliceryhl.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738584931; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=tuxJJmxGW3goIRW/OgstRWONs45CMDMWq5FHoZQDXkU=; b=qdG6c/mAKTnh/pZuhl4s4gV89KuiTQZxtJzwDkNux3ANfDBT0nt+9IrZEzX97Tr3YhbEcz Zf2zOW2UvXl8T6Ll/ayfY2ttrCM2DL6qZUSlElUrF6lmEPm2ceQi4DY/zYZqf6Fo//TyZh BjGEkTfMTQUV6oEKJAg/rbvnNOKVa2c= Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-438e180821aso22305995e9.1 for ; Mon, 03 Feb 2025 04:15:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738584929; x=1739189729; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=tuxJJmxGW3goIRW/OgstRWONs45CMDMWq5FHoZQDXkU=; b=XpqTDFUYtccwmctTMwLxZ39T+LoHmDCq40sV/emejZZtK9X52irXlDwjDtxMxkieke /S6eYoCBOTGJYIdCznzZF4UmKvbhYixwjRbAaSZJ7Q4NOIJ2NDhq7rAt+l4Tv8xHkJdr sAnj7vcmnOn0ZaSSQJvpaPN2nfIjTsksoZ8gV+KNiouMjpRu9gbdoYlKvQbbGIVsy5UO s2cYY1IYGyupVEC16U7hbC+kUDEMTP8C+0ZYrKrck63M+mZtw/oUHEm8OGk0ZZr8Fp+x pQB+U8k0lxEGtS1cpdJzoLD9rVd6X9iqQX0cF6O0rM+GSAQghxyxC+yKvscdvAL8v6aG 9CkA== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738584930; x=1739189730; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tuxJJmxGW3goIRW/OgstRWONs45CMDMWq5FHoZQDXkU=; b=K34zQ1W0l5sFmFcXKs0iz8yMOMCC0RFmcyvG9LsaxZxqo05L3X6U53l7TVyLp0rOrB Pw5DWatW2eZPNzp7mb86jotzNclvpDDUVsYPoiwSQxFVZLrUmwfLBCsunk74JqlbMlXj Xhy5GIG2/W8h6tYeHWxjpJFn9KU0qnjwDABUVoH6fvkPL6s7iyOvoZgmuDzX0wV8Ip94 bJOLUhSZMuG04+rFfMtsTc5B7/KtuaDMtEokLuaCYE/Oor8l0LnOsccXmvq9S6r7SgCC I2k6jb1JvjnyB2Ie2CgblTFnSKk+J0U3l7UcEeyDs6rS5PDP8et4VLkYNb2MungxN6d/ H3PA== X-Forwarded-Encrypted: i=1; AJvYcCVhOcOhC5JWkdYbVwt10XqDXCUg+WMPCv/ozh/SGx8KyHjT6WNdxsMIrl3hJKxONnbeXoLGqCQ6Ug==@kvack.org X-Gm-Message-State: AOJu0YyPDeYbnrZed2JLV1iS9XN3sm09ohwC8kHDF6U1GU2/okB7xyLc YlYxHZ4/AmFEO5c3X6KEv8wPY1994ge8p4BW079H2/d6FQ6t52WuNiztGmbQu94hC1ZJVFNZkOu Q81qgbgQFT1cMIA== X-Google-Smtp-Source: AGHT+IEtylH7bvRes7JMpNdtE64LFjbyAC2YtB7jplur6VQCzvF0U0vbh8ldT62ynC0a//US8WEfo3Aje7QGZ3Q= X-Received: from wmbhc24.prod.google.com ([2002:a05:600c:8718:b0:436:6fa7:621]) (user=aliceryhl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:4e14:b0:438:a214:52f4 with SMTP id 5b1f17b1804b1-438dc428c0amr167641675e9.25.1738584929779; Mon, 03 Feb 2025 04:15:29 -0800 (PST) Date: Mon, 03 Feb 2025 12:14:43 +0000 In-Reply-To: <20250203-vma-v13-0-2b998268a396@google.com> Mime-Version: 1.0 References: <20250203-vma-v13-0-2b998268a396@google.com> X-Developer-Key: i=aliceryhl@google.com; a=openpgp; fpr=49F6C1FAA74960F43A5B86A1EE7A392FDE96209F X-Developer-Signature: v=1; a=openpgp-sha256; l=18134; i=aliceryhl@google.com; h=from:subject:message-id; bh=jIwDE0aeb3Nk4rOdUKVsAyevX0pk+IVA+2Vzq2nuP6o=; b=owEBbQKS/ZANAwAKAQRYvu5YxjlGAcsmYgBnoLNNeqAtXmMLVlY0CKjjlEfSuZg8MJHG97rok y5F/UZucyKJAjMEAAEKAB0WIQSDkqKUTWQHCvFIvbIEWL7uWMY5RgUCZ6CzTQAKCRAEWL7uWMY5 RjDXEACFNur39ucawOYdqsVf0qKC51ENYeHIbq6oH6d1K+wE1T+QGaXNS4KRzfhdvNcT1iqIBGB 
1naDOr2c79uEmPeI2jbZE/kvmTgCm7zeUvU0VvdSmq3G52t70d6xodB3iEEu3J8EM9tMLpesY/B +YWmKpSLtDxhLQbZi4iGuJLkcJCUatqUPbnl4XgXDuO1zJl/KNXa0eTvsCza17FLNbmcboLkOuB N0hMY1nX7ubbATe1SMSapf0vKX9q6Sm3PHCfQATwrA6obXVCyi0VDl7S19H/iUk6wRRuXIvpotD lA9FCajNAKrLlcSfVczW5AZU//d52G03jhLSVyjnQ4K9LQLzZe0VfBiLLqeQxckraxr8UkpAJX6 SZtg+7W5nFDOhw4Vd/yB1qBH+7ts6VHfoLyjQABIGO92T4srgZqnpjgDD9eFLScaPxNKEZbNPzU GOScRm348hc9Bl64WlCej0XqnY7HUURRIrEKnGOc2g7vschV3sgf4UMAB0bwP5GuH0SNm8r6Zca 2V4ZGRSgygTtp5T5MJ33i9mB0TxeFiSUNBurnGapesi3y0IwI/iludS9kVwXh4+31Qev2ElRMmx Ks74GDM0El+394QF6hte28v7yoJu93eCRHe2zklBMaVlLjTxf0+a96hXaPqnrs7CV80iIVLQo5E SEl0y3MRD8KGT0A== X-Mailer: b4 0.13.0 Message-ID: <20250203-vma-v13-8-2b998268a396@google.com> Subject: [PATCH v13 8/8] task: rust: rework how current is accessed From: Alice Ryhl To: Miguel Ojeda , Matthew Wilcox , Lorenzo Stoakes , Vlastimil Babka , John Hubbard , "Liam R. Howlett" , Andrew Morton , Greg Kroah-Hartman , Arnd Bergmann , Jann Horn , Suren Baghdasaryan Cc: Alex Gaynor , Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Trevor Gross , linux-kernel@vger.kernel.org, linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: F3F784000B X-Stat-Signature: a1n7tfcwyj99negrpcfnwatu8po7xatg X-Rspam-User: X-HE-Tag: 1738584930-421701 X-HE-Meta: 
U2FsdGVkX19PXS3ULqGyD4mfxvbtPFMz1nNBFuHJSSPuyvp9nZ2TQaanF4uzNiG9D8DztcrDuZWfPLQjda5U93q+thky9FzQ+f5LkvIP38AyLMTKrDuqfnTyxRwaM6Ha1ec5LE6bxIAm+cAESs9++JP5yuf+iWP8xwGbQBBgBCDp0ATUpUAWK/BdFuPdJiWft6GYWOJz0IdpTYJfhsWzW2Tb4GWAG4b5EsoKbKhLEltBCuZEzYjLGtPBUSOhlbOKv+urxviGTqATlrW2jfu6I+MRK59dAjbPFufvrDEaP0Yhyhw3nU+zxsbTKR8HBNo5PbwAyxjAWDdkbJp1LxfRhTUsdjxlZDp4rE3AXRojcfAT1SgAXUP9JtWmLS5+VxDDPUJTivWMCZETmTzSpqRkfCyJYXT7ynK5VkPXAuI8aX2mI4JwJHmypWVPh9v7A4rmN5Jqb7Gg2S4sX8Xt2DwctlfiuqERZ8Ewew3hsp/QWIOu87wNppG3qKzYzwbAVDr7Ot26fQ0Mg8dvsGrPQ+0oDlFsC4sDrAsoe6x1MMkSw0T72WEUpLWZ4IWfH365YQMScgqrGUQ0VL8u73r8fR6M0pmAa0dT+1sIyeWsurcGgJ/GkgSxlfC+vZg3m3OiZarmcYS1Xum9Mif2H/vNv8R+rUsg69at5JLglpFFQ/3YYwtdHepW1+l1d6Y7eSEKIRwbRKeAQ+CM9Gro+qOgL4azmEpGVYQeRGi5WOdD17tXDWHaedK8CNEdvYQLtO3VmL+VshC1XerHr+tqN9v0ZF5RurL5x2+diFcx7tS7Ve+Otiecrinw1aQrOC1phyZ6GLuZgbgJ1vs7vWlG60GIR8ySjKcvb82B5TCwsYbpxAFgtay4abog4PoeHSzDrPYXQtCBC/eiuEJWiDIx8JbGiuV69X5O2le4B3RkK/4jh8CNlUpAup4RYrmLbwg6tMWFsqWpNWJ34PYz8vLnZ+oQSCz 5Wl/tmCO s3o7jlMYyKgyXFWBcqbaABYILzEICGL7EAOTMBxinHdwO17zoJf/sDTpmr5lgjxFLwhHh6NRCedqgmCUT2f853wkkAHRifwn74xeks04c7KaPvEsAXprFXV7YUKvDbbIv6ywj146kTpUGq7gayCqSE+9RU/4t1pkunuWfwm16WjueLbO4Mf07G7hanYBI/9fqmI5ULjoPyOF/FNI8CvAzO8ZcdnW+M8c05BaIK3bOgjKurYrC7froDyA++ODOfgutP4p8LUb1bXq9WVNZU0rwSuE2kTUgNlLhvpqoxL6UnoKVyXres+pYEIfeP9/U+UAmJqKd/zzfMxF5fOFac8HtaFaUev5pn6Oh8swpxR5UM5YebQluxze0YBcjdXBsNs/jt2lWCUDcub3Yqd8INH7Rlx4TFi9OoIDI2lZoGw0Sj6C/u/7d4wVWAcOye3ulPGcEeGEv8LmvOIMNCwwTm0AJcNJ+jCjAp2olqJ/dSe3mblNw9gXZXwmRcNzy12BPWd2DnhE6Lmjkj3H0oL+VCKu3rqFlF9ZZJ0FUH1+mLKULvznh9nv6eK3tWUUVhOdYgrrC9baZ8c/PNA5PH0h+hOXqpxP6elOF4Ol4k1nRTKL/NYAy+O0J7Js5/0106c4IBMupHHFNWn5Tc2bhU9+SnajETd9KFDzBoEzyFwuF4vLjO5PkN7quzWf8tiVSVw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.425531, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Introduce a new type called `CurrentTask` that lets you perform various operations that are only safe on the 
`current` task. Use the new type to provide a way to access the current
mm without incrementing its refcount.

With this change, you can write code such as

    let vma = current!().mm().lock_vma_under_rcu(addr);

without incrementing any refcounts.

This replaces the existing abstractions for accessing the current pid
namespace. With the old approach, every field access to current involves
both a macro and an unsafe helper function. The new approach simplifies
that to a single safe function on the `CurrentTask` type. This makes it
less heavy-weight to add additional current accessors in the future.

That said, creating a `CurrentTask` type like the one in this patch
requires that we are careful to ensure that it cannot escape the current
task or otherwise access things after they are freed. To do this, I
declared that it cannot escape the current "task context" where I
defined a "task context" as essentially the region in which `current`
remains unchanged. So e.g., release_task() or begin_new_exec() would
leave the task context.

If a userspace thread returns to userspace and later makes another
syscall, then I consider the two syscalls to be different task contexts.
This allows values stored in that task to be modified between syscalls,
even if they're guaranteed to be immutable during a syscall.

Ensuring correctness of `CurrentTask` is slightly tricky if we also want
the ability to have a safe `kthread_use_mm()` implementation in Rust. To
support that safely, there are two patterns that must be prevented,
since each would otherwise be a use-after-free:

    // Case 1: current!() called inside the scope.
    let mm;
    kthread_use_mm(some_mm, || {
        mm = current!().mm();
    });
    drop(some_mm);
    mm.do_something(); // UAF

and:

    // Case 2: current!() called before the scope.
    let mm;
    let task = current!();
    kthread_use_mm(some_mm, || {
        mm = task.mm();
    });
    drop(some_mm);
    mm.do_something(); // UAF

The existing `current!()` abstraction already natively prevents the
first case: The `&CurrentTask` would be tied to the inner scope, so the
borrow-checker ensures that no reference derived from it can escape the
scope.

Fixing the second case is a bit more tricky. The solution is to
essentially pretend that the contents of the scope execute on a
different thread, which means that only thread-safe types can cross the
boundary. Since `CurrentTask` is marked `NotThreadSafe`, attempts to move
it to another thread will fail, and this includes our pretend thread
boundary.

This has the disadvantage that other types that aren't thread-safe for
reasons unrelated to `current` also cannot be moved across the
`kthread_use_mm()` boundary. I consider this an acceptable tradeoff.

Reviewed-by: Boqun Feng
Reviewed-by: Andreas Hindborg
Signed-off-by: Alice Ryhl
---
 rust/kernel/task.rs | 247 +++++++++++++++++++++++++++-------------------------
 1 file changed, 129 insertions(+), 118 deletions(-)

diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 07bc22a7645c..0b6cb9a83a2e 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -7,6 +7,7 @@
 use crate::{
     bindings,
     ffi::{c_int, c_long, c_uint},
+    mm::MmWithUser,
     pid_namespace::PidNamespace,
     types::{ARef, NotThreadSafe, Opaque},
 };
@@ -31,22 +32,20 @@
 #[macro_export]
 macro_rules! current {
     () => {
-        // SAFETY: Deref + addr-of below create a temporary `TaskRef` that cannot outlive the
-        // caller.
+        // SAFETY: This expression creates a temporary value that is dropped at the end of the
+        // caller's scope. The following mechanisms ensure that the resulting `&CurrentTask` cannot
+        // leave current task context:
+        //
+        // * To return to userspace, the caller must leave the current scope.
+        // * Operations such as `begin_new_exec()` are necessarily unsafe and the caller of
+        //   `begin_new_exec()` is responsible for safety.
+        // * Rust abstractions for things such as a `kthread_use_mm()` scope must require the
+        //   closure to be `Send`, so the `NotThreadSafe` field of `CurrentTask` ensures that the
+        //   `&CurrentTask` cannot cross the scope in either direction.
         unsafe { &*$crate::task::Task::current() }
     };
 }
 
-/// Returns the currently running task's pid namespace.
-#[macro_export]
-macro_rules! current_pid_ns {
-    () => {
-        // SAFETY: Deref + addr-of below create a temporary `PidNamespaceRef` that cannot outlive
-        // the caller.
-        unsafe { &*$crate::task::Task::current_pid_ns() }
-    };
-}
-
 /// Wraps the kernel's `struct task_struct`.
 ///
 /// # Invariants
@@ -85,7 +84,7 @@ macro_rules! current_pid_ns {
 /// impl State {
 ///     fn new() -> Self {
 ///         Self {
-///             creator: current!().into(),
+///             creator: ARef::from(&**current!()),
 ///             index: 0,
 ///         }
 ///     }
@@ -105,6 +104,44 @@ unsafe impl Send for Task {}
 // synchronised by C code (e.g., `signal_pending`).
 unsafe impl Sync for Task {}
 
+/// Represents the [`Task`] in the `current` global.
+///
+/// This type exists to provide more efficient operations that are only valid on the current task.
+/// For example, to retrieve the pid-namespace of a task, you must use rcu protection unless it is
+/// the current task.
+///
+/// # Invariants
+///
+/// Each value of this type must only be accessed from the task context it was created within.
+///
+/// Of course, every thread is in a different task context, but for the purposes of this invariant,
+/// these operations also permanently leave the task context:
+///
+/// * Returning to userspace from system call context.
+/// * Calling `release_task()`.
+/// * Calling `begin_new_exec()` in a binary format loader.
+///
+/// Other operations temporarily create a new sub-context:
+///
+/// * Calling `kthread_use_mm()` creates a new context, and `kthread_unuse_mm()` returns to the
+///   old context.
+///
+/// This means that a `CurrentTask` obtained before a `kthread_use_mm()` call may be used again
+/// once `kthread_unuse_mm()` is called, but it must not be used between these two calls.
+/// Conversely, a `CurrentTask` obtained between a `kthread_use_mm()`/`kthread_unuse_mm()` pair
+/// must not be used after `kthread_unuse_mm()`.
+#[repr(transparent)]
+pub struct CurrentTask(Task, NotThreadSafe);
+
+// Make all `Task` methods available on `CurrentTask`.
+impl Deref for CurrentTask {
+    type Target = Task;
+    #[inline]
+    fn deref(&self) -> &Task {
+        &self.0
+    }
+}
+
 /// The type of process identifiers (PIDs).
 type Pid = bindings::pid_t;
 
@@ -131,119 +168,29 @@ pub fn current_raw() -> *mut bindings::task_struct {
     ///
     /// # Safety
     ///
-    /// Callers must ensure that the returned object doesn't outlive the current task/thread.
-    pub unsafe fn current() -> impl Deref<Target = Task> {
-        struct TaskRef<'a> {
-            task: &'a Task,
-            _not_send: NotThreadSafe,
+    /// Callers must ensure that the returned object is only used to access a [`CurrentTask`]
+    /// within the task context that was active when this function was called. For more details,
+    /// see the invariants section for [`CurrentTask`].
+    pub unsafe fn current() -> impl Deref<Target = CurrentTask> {
+        struct TaskRef {
+            task: *const CurrentTask,
         }
 
-        impl Deref for TaskRef<'_> {
-            type Target = Task;
+        impl Deref for TaskRef {
+            type Target = CurrentTask;
 
             fn deref(&self) -> &Self::Target {
-                self.task
+                // SAFETY: The returned reference borrows from this `TaskRef`, so it cannot outlive
+                // the `TaskRef`, which the caller of `Task::current()` has promised will not
+                // outlive the task/thread for which `self.task` is the `current` pointer. Thus, it
+                // is okay to return a `CurrentTask` reference here.
+                unsafe { &*self.task }
             }
         }
 
-        let current = Task::current_raw();
         TaskRef {
-            // SAFETY: If the current thread is still running, the current task is valid. Given
-            // that `TaskRef` is not `Send`, we know it cannot be transferred to another thread
-            // (where it could potentially outlive the caller).
-            task: unsafe { &*current.cast() },
-            _not_send: NotThreadSafe,
-        }
-    }
-
-    /// Returns a PidNamespace reference for the currently executing task's/thread's pid namespace.
-    ///
-    /// This function can be used to create an unbounded lifetime by e.g., storing the returned
-    /// PidNamespace in a global variable which would be a bug. So the recommended way to get the
-    /// current task's/thread's pid namespace is to use the [`current_pid_ns`] macro because it is
-    /// safe.
-    ///
-    /// # Safety
-    ///
-    /// Callers must ensure that the returned object doesn't outlive the current task/thread.
-    pub unsafe fn current_pid_ns() -> impl Deref<Target = PidNamespace> {
-        struct PidNamespaceRef<'a> {
-            task: &'a PidNamespace,
-            _not_send: NotThreadSafe,
-        }
-
-        impl Deref for PidNamespaceRef<'_> {
-            type Target = PidNamespace;
-
-            fn deref(&self) -> &Self::Target {
-                self.task
-            }
-        }
-
-        // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`.
-        //
-        // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive. A
-        // `unshare(CLONE_NEWPID)` or `setns(fd_pidns/pidfd, CLONE_NEWPID)` will not have an effect
-        // on the calling `Task`'s pid namespace. It will only effect the pid namespace of children
-        // created by the calling `Task`. This invariant guarantees that after having acquired a
-        // reference to a `Task`'s pid namespace it will remain unchanged.
-        //
-        // When a task has exited and been reaped `release_task()` will be called. This will set
-        // the `PidNamespace` of the task to `NULL`. So retrieving the `PidNamespace` of a task
-        // that is dead will return `NULL`. Note, that neither holding the RCU lock nor holding a
-        // referencing count to
-        // the `Task` will prevent `release_task()` being called.
-        //
-        // In order to retrieve the `PidNamespace` of a `Task` the `task_active_pid_ns()` function
-        // can be used. There are two cases to consider:
-        //
-        // (1) retrieving the `PidNamespace` of the `current` task
-        // (2) retrieving the `PidNamespace` of a non-`current` task
-        //
-        // From system call context retrieving the `PidNamespace` for case (1) is always safe and
-        // requires neither RCU locking nor a reference count to be held. Retrieving the
-        // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath
-        // like that is exposed to Rust.
-        //
-        // Retrieving the `PidNamespace` from system call context for (2) requires RCU protection.
-        // Accessing `PidNamespace` outside of RCU protection requires a reference count that
-        // must've been acquired while holding the RCU lock. Note that accessing a non-`current`
-        // task means `NULL` can be returned as the non-`current` task could have already passed
-        // through `release_task()`.
-        //
-        // To retrieve (1) the `current_pid_ns!()` macro should be used which ensure that the
-        // returned `PidNamespace` cannot outlive the calling scope. The associated
-        // `current_pid_ns()` function should not be called directly as it could be abused to
-        // created an unbounded lifetime for `PidNamespace`. The `current_pid_ns!()` macro allows
-        // Rust to handle the common case of accessing `current`'s `PidNamespace` without RCU
-        // protection and without having to acquire a reference count.
-        //
-        // For (2) the `task_get_pid_ns()` method must be used. This will always acquire a
-        // reference on `PidNamespace` and will return an `Option` to force the caller to
-        // explicitly handle the case where `PidNamespace` is `None`, something that tends to be
-        // forgotten when doing the equivalent operation in `C`. Missing RCU primitives make it
-        // difficult to perform operations that are otherwise safe without holding a reference
-        // count as long as RCU protection is guaranteed. But it is not important currently. But we
-        // do want it in the future.
-        //
-        // Note for (2) the required RCU protection around calling `task_active_pid_ns()`
-        // synchronizes against putting the last reference of the associated `struct pid` of
-        // `task->thread_pid`. The `struct pid` stored in that field is used to retrieve the
-        // `PidNamespace` of the caller. When `release_task()` is called `task->thread_pid` will be
-        // `NULL`ed and `put_pid()` on said `struct pid` will be delayed in `free_pid()` via
-        // `call_rcu()` allowing everyone with an RCU protected access to the `struct pid` acquired
-        // from `task->thread_pid` to finish.
-        //
-        // SAFETY: The current task's pid namespace is valid as long as the current task is running.
-        let pidns = unsafe { bindings::task_active_pid_ns(Task::current_raw()) };
-        PidNamespaceRef {
-            // SAFETY: If the current thread is still running, the current task and its associated
-            // pid namespace are valid. `PidNamespaceRef` is not `Send`, so we know it cannot be
-            // transferred to another thread (where it could potentially outlive the current
-            // `Task`). The caller needs to ensure that the PidNamespaceRef doesn't outlive the
-            // current task/thread.
-            task: unsafe { PidNamespace::from_ptr(pidns) },
-            _not_send: NotThreadSafe,
+            // CAST: The layout of `struct task_struct` and `CurrentTask` is identical.
+            task: Task::current_raw().cast(),
         }
     }
 
@@ -326,6 +273,70 @@ pub fn wake_up(&self) {
     }
 }
 
+impl CurrentTask {
+    /// Access the address space of the current task.
+    ///
+    /// This function does not touch the refcount of the mm.
+    #[inline]
+    pub fn mm(&self) -> Option<&MmWithUser> {
+        // SAFETY: The `mm` field of `current` is not modified from other threads, so reading it is
+        // not a data race.
+        let mm = unsafe { (*self.as_ptr()).mm };
+
+        if mm.is_null() {
+            return None;
+        }
+
+        // SAFETY: If `current->mm` is non-null, then it references a valid mm with a non-zero
+        // value of `mm_users`. Furthermore, the returned `&MmWithUser` borrows from this
+        // `CurrentTask`, so it cannot escape the scope in which the current pointer was obtained.
+        //
+        // This is safe even if `kthread_use_mm()`/`kthread_unuse_mm()` are used. There are two
+        // relevant cases:
+        // * If the `&CurrentTask` was created before `kthread_use_mm()`, then it cannot be
+        //   accessed during the `kthread_use_mm()`/`kthread_unuse_mm()` scope due to the
+        //   `NotThreadSafe` field of `CurrentTask`.
+        // * If the `&CurrentTask` was created within a `kthread_use_mm()`/`kthread_unuse_mm()`
+        //   scope, then the `&CurrentTask` cannot escape that scope, so the returned `&MmWithUser`
+        //   also cannot escape that scope.
+        // In either case, it's not possible to read `current->mm` and keep using it after the
+        // scope is ended with `kthread_unuse_mm()`.
+        Some(unsafe { MmWithUser::from_raw(mm) })
+    }
+
+    /// Access the pid namespace of the current task.
+    ///
+    /// This function does not touch the refcount of the namespace or use RCU protection.
+    ///
+    /// To access the pid namespace of another task, see [`Task::get_pid_ns`].
+    #[doc(alias = "task_active_pid_ns")]
+    #[inline]
+    pub fn active_pid_ns(&self) -> Option<&PidNamespace> {
+        // SAFETY: It is safe to call `task_active_pid_ns` without RCU protection when calling it
+        // on the current task.
+        let active_ns = unsafe { bindings::task_active_pid_ns(self.as_ptr()) };
+
+        if active_ns.is_null() {
+            return None;
+        }
+
+        // The lifetime of `PidNamespace` is bound to `Task` and `struct pid`.
+        //
+        // The `PidNamespace` of a `Task` doesn't ever change once the `Task` is alive.
+        //
+        // From system call context retrieving the `PidNamespace` for the current task is always
+        // safe and requires neither RCU locking nor a reference count to be held. Retrieving the
+        // `PidNamespace` after `release_task()` for current will return `NULL` but no codepath
+        // like that is exposed to Rust.
+        //
+        // SAFETY: If `current`'s pid ns is non-null, then it references a valid pid ns.
+        // Furthermore, the returned `&PidNamespace` borrows from this `CurrentTask`, so it cannot
+        // escape the scope in which the current pointer was obtained, e.g. it cannot live past a
+        // `release_task()` call.
+        Some(unsafe { PidNamespace::from_ptr(active_ns) })
+    }
+}
+
 // SAFETY: The type invariants guarantee that `Task` is always refcounted.
 unsafe impl crate::types::AlwaysRefCounted for Task {
     fn inc_ref(&self) {