From patchwork Mon Apr 29 17:04:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Groves X-Patchwork-Id: 13647384 Received: from mail-ot1-f43.google.com (mail-ot1-f43.google.com [209.85.210.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2F96E83CBA; Mon, 29 Apr 2024 17:04:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410291; cv=none; b=RYssHjhPwGCNkwmuA6KenDNRs9+bzhC/udvL32R5pZ5PGVTMpBBx+GkyyIkKaNr2ggAnoCzf6+nNylMfMwCUk4jdtq0QqmZui2hXHbqfVskwOI96KvgZyT2yhckDAqv8Q/NTgsliMR6d3YhhRNEPN6sShaTXVXS8qwKy/LPB44w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410291; c=relaxed/simple; bh=DnBFz4/Se6xVWS99x9m7QEYEtuOZb/pR9I7lm8YTTnM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Civhfd/xUp8Y+kHtyIwVglMmkT8FXuaC2biHN0IcSTcmB/CVO+vLjVHsE9tW9MXWDqfxJx/amO35/zMrau3j0/vRQiVjCdZzGSMb35vHBJknr+AF5hK4KlLXUaGHCl08ZPiJvrOOsOM3pg1IjDf50yZehIYqjIfrVXiiAEAIk+Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=PaJd04Kb; arc=none smtp.client-ip=209.85.210.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="PaJd04Kb" Received: by mail-ot1-f43.google.com with SMTP id 46e09a7af769-6ee4dcc4567so421157a34.3; Mon, 29 Apr 2024 10:04:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714410288; x=1715015088; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=jhow96kP5BU3qcQBMxGZEfW9W+QoKa4rPIJ24CtlxHk=; b=PaJd04KbR5FB+9mASYbIUsOhlPihkwX31rS/6ph1LWrSNYhkABaI86bIpT8aPvqgWo WJNOMbbbtBqMCMPREPsBtWheUixLtZzEjWGm0BaAWOBi8RuYH6oKLU9mhSe/WqNfohtn 7c4fQnKlrFFIOugXbdf8avOeOCKXVqrAGSN8C19BwUOkpzbAeNNZf7AKTXfDwqXcJRFV hV49VV+PXLvjZwwfmnnUIPvMCvS4iymZ6uaR4VakZlY6lvAdqypXuSqW2k/eexYwNkW+ T0NlKggUfQJgW3Jf1yrYQTsO2tL6rI7w/MAkvEHY48sABPd/D5GODtetDRn3hYFIBspk s/SA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714410288; x=1715015088; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=jhow96kP5BU3qcQBMxGZEfW9W+QoKa4rPIJ24CtlxHk=; b=m4IiAHeuiO6RPzvosGtdLOMFrO5yvgVY+ZpXaYsJ0dnAoSbtiE73MqXkHEeRjCCNao CrxI3rMA9vN2H7+2RafgexHM90IC4WI/4XOZi85+seI6nQV9hELIRQgqWWO8ygtATCw6 B0W76ISjDqNsaOdVoZV7pszo8HAJ3JvqPwaf/HADby4csC8ZIeGkSPA31LDYm6DKUo/q sBetlqu9Yg9yf05uzk3KCW5cln51f2rlZ4DkTUjQ5eBvrE6ByisR1/MGzFD8N4xQpjs7 eJsWV0FPVRLClOiXY2XK7xUoxvXNTg/CBg09FOb5CHsyvR2SFrXz/vp0gcuLyHMQZVzD p+fg== X-Forwarded-Encrypted: i=1; AJvYcCWYHdzwFPft+o9GWxUb5buYFJfk3bq/2RVn3rlAoEK8IXkO8LSYVlfz/fbjb/Doe5ZkMSEW6kJPtS5aJV0EGVRypoiQQoo8dSz66Cq5e15leGMekTezM8Q8AWUBFxdukBKqfui3Ao+9cA== X-Gm-Message-State: AOJu0Yx/PeTmLAogCW0H9YcZlkbahBLMPmUH4wYhQZTq5q/nDeWiT8Tf R7K5bKaMPUewo/CAF9oKHrpt5bJPcYT68PHympcCTPVkJPhXF5M1 X-Google-Smtp-Source: AGHT+IFT0esK+nrgP4Zf50tdR4YgA9tgrSYLGG5TA0ZvSJostVvZqlCI2BgehfoxtpiJFe8p6wC3NQ== X-Received: by 2002:a05:6830:19cc:b0:6eb:d847:ff8a with SMTP id p12-20020a05683019cc00b006ebd847ff8amr8346147otp.9.1714410288151; Mon, 29 Apr 2024 10:04:48 -0700 (PDT) Received: from localhost.localdomain ([70.114.203.196]) by smtp.gmail.com with 
	ESMTPSA id g1-20020a9d6201000000b006ea20712e66sm4074448otj.17.2024.04.29.10.04.45
	(version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256);
	Mon, 29 Apr 2024 10:04:47 -0700 (PDT)
Sender: John Groves
From: John Groves
X-Google-Original-From: John Groves
To: John Groves, Jonathan Corbet, Jonathan Cameron, Dan Williams,
	Vishal Verma, Dave Jiang, Alexander Viro, Christian Brauner, Jan Kara,
	Matthew Wilcox, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	nvdimm@lists.linux.dev
Cc: John Groves, john@jagalactic.com, Dave Chinner, Christoph Hellwig,
	dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap,
	Jerome Glisse, Aravind Ramesh, Ajay Joshi, Eishan Mirakhur, Ravi Shankar,
	Srinivasulu Thanneeru, Luis Chamberlain, Amir Goldstein, Chandan Babu R,
	Bagas Sanjaya, "Darrick J . Wong", Kent Overstreet, Steve French,
	Nathan Lynch, Michael Ellerman, Thomas Zimmermann, Julien Panis,
	Stanislav Fomichev, Dongsheng Yang, John Groves
Subject: [RFC PATCH v2 01/12] famfs: Introduce famfs documentation
Date: Mon, 29 Apr 2024 12:04:17 -0500
Message-Id: <0270b3e2d4c6511990978479771598ad62cf2ddd.1714409084.git.john@groves.net>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

* Introduce Documentation/filesystems/famfs.rst into the Documentation tree
* Add famfs.rst to the filesystems doc index
* Add famfs' ioctl opcodes to ioctl-number.rst
* Update the MAINTAINERS file

Signed-off-by: John Groves
Reviewed-by: Bagas Sanjaya
---
 Documentation/filesystems/famfs.rst           | 135 ++++++++++++++++++
 Documentation/filesystems/index.rst           |   1 +
 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 MAINTAINERS                                   |   9 ++
 4 files changed, 146 insertions(+)
 create mode 100644 Documentation/filesystems/famfs.rst

diff --git a/Documentation/filesystems/famfs.rst
b/Documentation/filesystems/famfs.rst
new file mode 100644
index 000000000000..792785598d6a
--- /dev/null
+++ b/Documentation/filesystems/famfs.rst
@@ -0,0 +1,135 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _famfs_index:
+
+==================================================================
+famfs: The kernel component of the famfs shared memory file system
+==================================================================
+
+- Copyright (C) 2024 Micron Technology, Inc.
+
+Introduction
+============
+Compute Express Link (CXL) provides a mechanism for disaggregated or
+fabric-attached memory (FAM). This creates opportunities for data sharing;
+clustered apps that would otherwise have to shard or replicate data can
+share one copy in disaggregated memory.
+
+Famfs, which is not CXL-specific in any way, provides a mechanism for
+multiple hosts to use data in shared memory, by giving it a file system
+interface. With famfs, any app that understands files (which is almost
+all apps) can access data sets in shared memory. Although famfs
+supports read and write, the real point is to support mmap, which
+provides direct (dax) access to the memory - either writable or read-only.
+
+Shared memory can pose complex coherency and synchronization issues, but
+there are also simple cases. Two simple and eminently useful patterns that
+occur frequently in data analytics and AI are:
+
+* Serial Sharing - Only one host or process at a time has access to a file
+* Read-only Sharing - Multiple hosts or processes share read-only access
+  to a file
+
+The famfs kernel file system is part of the famfs framework; user space
+components [1] handle metadata allocation and distribution, and direct the
+famfs kernel module to instantiate files that map to specific memory.
+
+The famfs framework manages coherency of its own metadata and structures,
+but does not attempt to manage coherency for applications.
+
+Famfs also provides data isolation between files. That is, even though
+the host has access to an entire memory "device" (as a dax device), apps
+cannot write to memory for which the file is read-only, and mapping one
+file provides isolation from the memory of all other files. This is pretty
+basic, but some experimental shared memory usage patterns provide no such
+isolation.
+
+Principles of Operation
+=======================
+
+Without its user space components, the famfs kernel module doesn't do
+anything useful. The user space components maintain superblocks and
+metadata logs, and use the famfs kernel component to provide a file system
+view of shared memory across multiple hosts.
+
+Each host has an independent instance of the famfs kernel module. After
+mount, files are not visible until the user space component instantiates
+them (normally by playing the famfs metadata log).
+
+Once instantiated, files on each host can point to the same shared memory,
+but in-memory metadata (inodes, etc.) is ephemeral on each host that has a
+famfs instance mounted. Like ramfs, the famfs in-kernel file system has no
+backing store for metadata modifications. If metadata mutations are ever
+persisted, that must be done by the user space components. However,
+mutations to file data are saved to the shared memory - subject to write
+permission and processor cache behavior.
+
+
+Famfs is Not a Conventional File System
+---------------------------------------
+
+Famfs files can be accessed by conventional means, but there are
+limitations. The kernel component of famfs is not involved in the
+allocation of backing memory for files at all; the famfs user space
+creates files and passes the allocation extent lists into the kernel via
+the per-file FAMFSIOC_MAP_CREATE ioctl. A file that lacks this metadata is
+treated as invalid by the famfs kernel module. As a practical matter, files
+must be created via the famfs library or CLI, but they can be consumed as
+if they were conventional files.
+
+Famfs differs in some important ways from conventional file systems:
+
+* Files must be pre-allocated by the famfs framework; allocation is never
+  performed on (or after) write.
+* Any operation that changes a file's size is considered to put the file
+  in an invalid state, disabling access to the data. It may be possible to
+  revisit this in the future. (Typically the famfs user space can restore
+  files to a valid state by replaying the famfs metadata log.)
+
+Famfs exists to apply the existing file system abstractions to shared
+memory so applications and workflows can more easily adapt to an
+environment with disaggregated shared memory.
+
+Memory Error Handling
+=====================
+
+Possible memory errors include timeouts, poison and unexpected
+reconfiguration of an underlying dax device. In all of these cases, famfs
+receives a call via its dax_holder_operations->notify_failure() function.
+If any memory errors have been detected, access to the affected famfs
+mount is disabled to avoid further errors or corruption. Testing indicates
+that a famfs instance that has encountered errors can be unmounted cleanly,
+but repairing memory errors or corruption is outside the scope of famfs.
+
+Key Requirements
+================
+
+The primary requirements for famfs are:
+
+1. Must support a file system abstraction backed by sharable dax memory
+2. Files must efficiently handle VMA faults
+3. Must support metadata distribution in a sharable way
+4. Must handle clients with a stale copy of metadata
+
+The famfs kernel component takes care of 1-2 above by caching each file's
+mapping metadata in the kernel.
+
+Requirements 3 and 4 are handled by the user space components, and are
+largely orthogonal to the functionality of the famfs kernel module.
+
+Requirements 3 and 4 cannot be met by conventional fs-dax file systems
+(e.g. xfs and ext4) because they use write-back metadata; it is not valid
+to mount such a file system on two hosts from the same in-memory image.
+
+
+Famfs Usage
+===========
+
+Famfs usage is documented at [1].
+
+
+References
+==========
+
+- [1] Famfs user space repository and documentation
+  https://github.com/cxl-micron-reskit/famfs
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 1f9b4c905a6a..0fe2c70a106f 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -87,6 +87,7 @@ Documentation for filesystem implementations.
    ext3
    ext4/index
    f2fs
+   famfs
    gfs2
    gfs2-uevents
    gfs2-glocks
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index c472423412bf..ac407802cf10 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -289,6 +289,7 @@ Code  Seq#    Include File            Comments
 'u'   00-1F  linux/smb_fs.h          gone
 'u'   20-3F  linux/uvcvideo.h        USB video class host driver
 'u'   40-4f  linux/udmabuf.h         userspace dma-buf misc device
+'u'   50-5F  linux/famfs_ioctl.h     famfs shared memory file system
 'v'   00-1F  linux/ext2_fs.h         conflict!
 'v'   00-1F  linux/fs.h              conflict!
 'v'   00-0F  linux/sonypi.h          conflict!
diff --git a/MAINTAINERS b/MAINTAINERS
index ebf03f5f0619..3f2d847dcf01 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8180,6 +8180,15 @@ F:	Documentation/networking/failover.rst
 F:	include/net/failover.h
 F:	net/core/failover.c
 
+FAMFS
+M:	John Groves
+M:	John Groves
+M:	John Groves
+L:	linux-cxl@vger.kernel.org
+L:	linux-fsdevel@vger.kernel.org
+S:	Supported
+F:	Documentation/filesystems/famfs.rst
+
 FANOTIFY
 M:	Jan Kara
 R:	Amir Goldstein

From patchwork Mon Apr 29 17:04:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647385
Received: from mail-ot1-f53.google.com (mail-ot1-f53.google.com [209.85.210.53])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBD6283CBA;
	Mon, 29 Apr 2024 17:04:53 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.53
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410296; cv=none;
	b=cb2pfr41QqwLcgYf9kqY9sA41iVUTrMqmnvsg/5tpeKIp0m2eMuEjp46lViEWHOTdcpjMfbA0dhQTl8UaqPGtoi+dwyhFaJIWWoPTFNhv0FoyMHeHPjwareSMqdPu6Mtf6NqCe946eJLWrJk3fyIsIWAkwQoTEByWp07csdUXuM=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410296; c=relaxed/simple;
	bh=WMjz9J531Kmle80GG+B28JoC2U8bxRD8YtN4xwrFy4U=;
	h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version;
	b=hZr0xWTpiwBUFTU4u8G+W5kS/VZlpfvoh5d9c2J1BBLxIVajMqGSqw6ffiewXYnPN5mrR2XMkiKzTq8IHza1vMCApdsQtsBPAi4bd5qX0OBXZojosDdZ0zuYUjhhlvu3fLlTHSxUHGgRKn7U9gzIDdiXqzd+G/zQhGMCgKo2agg=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net;
	spf=pass smtp.mailfrom=gmail.com;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=c4OoeUyo;
	arc=none smtp.client-ip=209.85.210.53
Authentication-Results:
smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="c4OoeUyo" Received: by mail-ot1-f53.google.com with SMTP id 46e09a7af769-6ea2ac4607aso2546447a34.3; Mon, 29 Apr 2024 10:04:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714410293; x=1715015093; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=yFzk9HeKkizf0o48ibh6kc3Q60VfzVMF4+JwNubBTrc=; b=c4OoeUyohJX81A95HlwPlHb2xBrXn0h8iaGZ+SSwWbAUT6jUNNlnkAQ0NoTak7wltG J+UGWUogef6HIUnGOl5EIDc26SiOT0OJ+Hf/cxY3Wsj6Hy+hDObXy9I1ubYKLMYV0Si0 NutvVcHkhXNPXTfEtQYLGyn2iQM2++VLcH+QwaZr9XIcERGan1aNRsR5uksB5OHCjdMR 3xS4wHZcchncHBhCEJiQj8jEab8nkG6AC2hXLrVkjmxsgy/Yz1VwZcl9ev5+xHVmDpG/ ueDoOC2iCkfRifxlTGSSQJvyQw2pM4diOyaepwvbaPMrSpP4NzIVHnlBOe8/EaJ/Jr3T rnOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714410293; x=1715015093; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=yFzk9HeKkizf0o48ibh6kc3Q60VfzVMF4+JwNubBTrc=; b=F3Y0ltsh5Wn7QAWAT0hJroTnFPnhR/oE6QK9JLSIM/wORJTtSHLqwIYO8ukuiFtzpl eBId6IhcdZ666/p5O8LGzCBeElmFm9jlODZ8g/IYO/b6vdQBrXx34KJheRhIICNmnzH9 9Ze4gRiCv8Z/7f8A86zwLPZMFXTOF0Z1maQFgrqlMnBRw5ZoEo584pc/m+xXs1UBiVSJ ydjpMrU3SnGFJmK/wHkDs9tj31PAGwhdvI5Rms/gIBZDLgiPP2T+GL2d+msRQbSorU8I niASvbD3XSD7XTm8/BnpCXMf/pZ/sZ+e4dLUmyb8wTs3huiqCVWGiJB14KArOoDKu8Ij rcow== X-Forwarded-Encrypted: i=1; AJvYcCVz1slq/qfWas9nqp1OlC2lX/aKXzwblwQnffbeqYTutG8PwasWvaCrHBa6pDDRd1ZS3hTKOEfXRIVE6NUmvkaesoQS7p1K4sT3OcNgA44T6LcFg06i44lYTWhNRc1g/0QLChfWrtrfOA== X-Gm-Message-State: 
AOJu0YxWB78NVNqK0oeH5PwIMsnvdskekbI4eP5ZqB4lxPk5eef+w4Qv 1SukKfzNfP5zqVzGTfmzeZhpm1mgALZHHlnHahf7BV1OzptVSmgm X-Google-Smtp-Source: AGHT+IH1LmxDBYpYdV8qfbKzAt1ETJgRjycszT1zAA1fNj6kpgVTUiKrSaeYD5ARjcBkfK0lW4EGeg== X-Received: by 2002:a05:6830:4a2:b0:6eb:7c52:fd19 with SMTP id l2-20020a05683004a200b006eb7c52fd19mr12585723otd.16.1714410292768; Mon, 29 Apr 2024 10:04:52 -0700 (PDT) Received: from localhost.localdomain ([70.114.203.196]) by smtp.gmail.com with ESMTPSA id g1-20020a9d6201000000b006ea20712e66sm4074448otj.17.2024.04.29.10.04.49 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 29 Apr 2024 10:04:52 -0700 (PDT) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick J . Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves Subject: [RFC PATCH v2 02/12] dev_dax_iomap: Move dax_pgoff_to_phys() from device.c to bus.c Date: Mon, 29 Apr 2024 12:04:18 -0500 Message-Id: <552c86dd6c3c4252994a94e23bad2cb95e3ed392.1714409084.git.john@groves.net> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 No changes to the function - just moved it. dev_dax_iomap needs to call this function from drivers/dax/bus.c. 
drivers/dax/bus.c can't call functions in drivers/dax/device.c - that creates
a circular linkage dependency - but device.c can call functions in bus.c.
Also exports dax_pgoff_to_phys() since both bus.c and device.c now call it.

Signed-off-by: John Groves
---
 drivers/dax/bus.c    | 24 ++++++++++++++++++++++++
 drivers/dax/device.c | 23 -----------------------
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 797e1ebff299..f894272beab8 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1447,6 +1447,30 @@ static const struct device_type dev_dax_type = {
 	.groups = dax_attribute_groups,
 };
 
+/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
+__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
+		unsigned long size)
+{
+	int i;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
+		struct range *range = &dax_range->range;
+		unsigned long long pgoff_end;
+		phys_addr_t phys;
+
+		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
+		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
+			continue;
+		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
+		if (phys + size - 1 <= range->end)
+			return phys;
+		break;
+	}
+	return -1;
+}
+EXPORT_SYMBOL_GPL(dax_pgoff_to_phys);
+
 static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
 {
 	struct dax_region *dax_region = data->dax_region;
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 93ebedc5ec8c..40ba660013cf 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -50,29 +50,6 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 	return 0;
 }
 
-/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
-__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
-		unsigned long size)
-{
-	int i;
-
-	for (i = 0; i < dev_dax->nr_range; i++) {
-		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
-		struct range *range = &dax_range->range;
-		unsigned long long pgoff_end;
-		phys_addr_t phys;
-
-		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
-		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
-			continue;
-		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
-		if (phys + size - 1 <= range->end)
-			return phys;
-		break;
-	}
-	return -1;
-}
-
 static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
 			      unsigned long fault_size)
 {

From patchwork Mon Apr 29 17:04:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647386
Received: from mail-ot1-f54.google.com (mail-ot1-f54.google.com [209.85.210.54])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8610F83CBA;
	Mon, 29 Apr 2024 17:04:58 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.54
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410300; cv=none;
	b=HehtC/EPuX5qIUxdCl7atJ4irhLDpsOJByXNSGwpas5sErGWW2TRB5zx+LOmcIOMkGEXuaTIJbzqPtzZcllDJZMmLKKnemy99KAi+9F9jTZY05+WZRXb8a3cCI3H+gNMiePh6KZxF1moVSzFtGAu4ATnReKuj5h9fQ4mUwUppW0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410300; c=relaxed/simple;
	bh=rgBnb8kNelGFoEuF+9gSifpAE2erEJQ3IXeaCnjnJIE=;
	h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version;
	b=h7ADVe84Sed5OJUgHi9wvzafWSbMscWUNeSh4r11/DsPe8lYChiBi8T20UVQjWrYEkqBcvH/6gcQXY/h4PRaDWhhCc3t07nzLPWv7b7yFzDdcDhwDSNMfrMBy+CrLyp8FjU1Vo6q2mYQID1A+4dWyWrML9ieWRc7mpigIZZxFz4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net;
	spf=pass smtp.mailfrom=gmail.com;
	dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ZBydDiH+;
	arc=none smtp.client-ip=209.85.210.54
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZBydDiH+" Received: by mail-ot1-f54.google.com with SMTP id 46e09a7af769-6eb86b69e65so2877268a34.3; Mon, 29 Apr 2024 10:04:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714410298; x=1715015098; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=cbJ8300Bg7dObjiydXebhYyYyzlw5PN8FKXGfdtJkHo=; b=ZBydDiH+fA/1bP3GZ/RcllDl6TKKA/IqcI/ldfXDPMtB66NjqqDETmCs4rxRnbo0J/ 1MAsuWkB5wVn7NI8VFJRS3nWu4xn4MQO1PiqAswRkzDVInHOoQ6sr3maGMWu0Kt12Fwo Apc7h2D9GfH6p4EPy2CgeaEBfUO8u2tyFHXVyBhR66RslDDPxZz4KC9LN7H0K4ig/zec NQcu9VnGM6qYDbzKn1IgzneOrNKBEdgzQhyr3cqb+JzODummA3NU+jxIYpwXRdOZCS8O SCHyaPDYEbWwcbnSmc6bMqcBlt4CfXdNTAM5Xs0MWWjpcGsNKqEFiEMdTSiZA61f2S9f qnyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714410298; x=1715015098; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=cbJ8300Bg7dObjiydXebhYyYyzlw5PN8FKXGfdtJkHo=; b=Z9/frp7Q/PgckP5Apm5lqhjp6y64X7C/rXqyFucW5RLmmzCoLHz9JTVrFh/aqZ0MZS umO5l8vKR9qoRycF3gVN/ke3+XtwffRJ6zJFGKphmKe3NNW+dL6Od+cNZ/mSWcyfSK1G g4S80GLw/rK6aR3FAYc76kwn8OTiXAYOPMSMc71sS1cQEKHHtSp+v0fY/0g+T3ts4uDR TRIYsdlqInFTO/uIPwBiPfYnqa93b5ABMJ54AWdP/ffJ3Hn9JFg65iAi2C743AqrW2MR Nzfmaz7Q2yYzK+lXPLajxw7N7u2YXGUiWf+wP6HU4JmR7eBwyZ/Wvn5ebY0MqwwVLv/V dvrw== X-Forwarded-Encrypted: i=1; AJvYcCUE0npJG1qODwyF8l+5wIC3zPdZXCerE/PMjgNBR1e7N94beEBpmNGHbES/uIHcQPAiTDiLqyy7g14PjcRPALrdplzms5Dqc732pOGsTefIj/XXr5CXn3ptDMNzHq5HzbfZfn6nCyahcg== 
X-Gm-Message-State: AOJu0YyPabATQIVRJW35GIySFLR1Rqf4SfoO1fwzk3J5oq1vyiDB/FAz estuxQPFM6jFtVH3wgv5lyzRu4daaLlCbmWwzdYr6A9T+TWrtRqY X-Google-Smtp-Source: AGHT+IGTSUZCw51+UtLT9AMWjJRVxBGYIxlCGo0X8H6A2GGnenuf0ViFidr/sXJeRaXyY8aUr+2TQw== X-Received: by 2002:a9d:7981:0:b0:6ee:2d1e:10f9 with SMTP id h1-20020a9d7981000000b006ee2d1e10f9mr4520928otm.15.1714410297664; Mon, 29 Apr 2024 10:04:57 -0700 (PDT) Received: from localhost.localdomain ([70.114.203.196]) by smtp.gmail.com with ESMTPSA id g1-20020a9d6201000000b006ea20712e66sm4074448otj.17.2024.04.29.10.04.54 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 29 Apr 2024 10:04:57 -0700 (PDT) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick J . Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves Subject: [RFC PATCH v2 03/12] dev_dax_iomap: Add fs_dax_get() func to prepare dax for fs-dax usage Date: Mon, 29 Apr 2024 12:04:19 -0500 Message-Id: X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This function should be called by fs-dax file systems after opening the devdax device. This adds holder_operations, which effects exclusivity between callers of fs_dax_get(). 
This function serves the same role as fs_dax_get_by_bdev(), which dax file
systems call after opening the pmem block device.

This also adds the CONFIG_DEV_DAX_IOMAP Kconfig parameter.

Signed-off-by: John Groves
---
 drivers/dax/Kconfig |  6 ++++++
 drivers/dax/super.c | 30 ++++++++++++++++++++++++++++++
 include/linux/dax.h |  5 +++++
 3 files changed, 41 insertions(+)

diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index a88744244149..b1ebcc77120b 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -78,4 +78,10 @@ config DEV_DAX_KMEM
 
 	  Say N if unsure.
 
+config DEV_DAX_IOMAP
+	depends on DEV_DAX && DAX
+	def_bool y
+	help
+	  Support iomap mapping of devdax devices (for FS-DAX file
+	  systems that reside on character /dev/dax devices)
 endif
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index aca71d7fccc1..4b55f79849b0 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -122,6 +122,36 @@ void fs_put_dax(struct dax_device *dax_dev, void *holder)
 EXPORT_SYMBOL_GPL(fs_put_dax);
 #endif /* CONFIG_BLOCK && CONFIG_FS_DAX */
 
+#if IS_ENABLED(CONFIG_DEV_DAX_IOMAP)
+/**
+ * fs_dax_get()
+ *
+ * fs-dax file systems call this function to prepare to use a devdax device for
+ * fsdax. This is like fs_dax_get_by_bdev(), but the caller already has struct
+ * dev_dax (and there is no bdev). The holder makes this exclusive.
+ *
+ * @dax_dev: dev to be prepared for fs-dax usage
+ * @holder: filesystem or mapped device inside the dax_device
+ * @hops: operations for the inner holder
+ *
+ * Returns: 0 on success, <0 on failure
+ */
+int fs_dax_get(struct dax_device *dax_dev, void *holder,
+	       const struct dax_holder_operations *hops)
+{
+	if (!dax_dev || !dax_alive(dax_dev) || !igrab(&dax_dev->inode))
+		return -ENODEV;
+
+	if (cmpxchg(&dax_dev->holder_data, NULL, holder))
+		return -EBUSY;
+
+	dax_dev->holder_ops = hops;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fs_dax_get);
+#endif /* DEV_DAX_IOMAP */
+
 enum dax_device_flags {
 	/* !alive + rcu grace period == no new operations / mappings */
 	DAXDEV_ALIVE,
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9d3e3327af4c..4a86716f932a 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -57,6 +57,11 @@ struct dax_holder_operations {
 
 #if IS_ENABLED(CONFIG_DAX)
 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
+
+#if IS_ENABLED(CONFIG_DEV_DAX_IOMAP)
+int fs_dax_get(struct dax_device *dax_dev, void *holder, const struct dax_holder_operations *hops);
+struct dax_device *inode_dax(struct inode *inode);
+#endif
 void *dax_holder(struct dax_device *dax_dev);
 void put_dax(struct dax_device *dax_dev);
 void kill_dax(struct dax_device *dax_dev);

From patchwork Mon Apr 29 17:04:20 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647387
Received: from mail-oa1-f42.google.com (mail-oa1-f42.google.com [209.85.160.42])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id B2D1783CBA;
	Mon, 29 Apr 2024 17:05:02 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.42
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410304; cv=none;
From: John Groves
To: John Groves, Jonathan Corbet, Jonathan Cameron, Dan Williams,
 Vishal Verma, Dave Jiang, Alexander Viro, Christian Brauner, Jan Kara,
 Matthew Wilcox, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 nvdimm@lists.linux.dev
Cc: John Groves, john@jagalactic.com, Dave Chinner, Christoph Hellwig,
 dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap,
 Jerome Glisse, Aravind Ramesh, Ajay Joshi, Eishan Mirakhur, Ravi Shankar,
 Srinivasulu Thanneeru, Luis Chamberlain, Amir Goldstein, Chandan Babu R,
 Bagas Sanjaya, "Darrick J . Wong", Kent Overstreet, Steve French,
 Nathan Lynch, Michael Ellerman, Thomas Zimmermann, Julien Panis,
 Stanislav Fomichev, Dongsheng Yang, John Groves
Subject: [RFC PATCH v2 04/12] dev_dax_iomap: Save the kva from memremap
Date: Mon, 29 Apr 2024 12:04:20 -0500
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
MIME-Version: 1.0

Save the kernel virtual address (kva) returned by memremap, because we
need it for iomap read/write support.

Prior to famfs there were no iomap users of /dev/dax, so the virtual
address from memremap was not needed.

Also: in some cases dev_dax_probe() is called with the first
dev_dax->range starting past the start of pgmap[0].range. In those cases
we need to add the difference to virt_addr so that the physical addresses
in dev_dax->ranges match dev_dax->virt_addr.

This happens with devdax devices that started as pmem and were converted
to devdax. I'm not sure whether the offset is due to label storage or
page tables, but this works in all known cases.
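
For illustration only (this is not part of the patch): the offset
adjustment described above can be modeled in plain userspace C. The
struct and function names below are hypothetical stand-ins for
pgmap[0].range, dev_dax->ranges[0].range and the address returned by
memremap; the sketch assumes a single range and only mirrors the
arithmetic, including the fallback to a zero offset when the pgmap
starts after the data.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch, not kernel code. "sketch_range" is a hypothetical
 * stand-in for the kernel's struct range; only the start address matters.
 */
struct sketch_range {
	uint64_t start;	/* physical start address */
};

/* Byte distance from the start of the mapping to the first data range. */
static uint64_t sketch_data_offset(struct sketch_range pgmap0,
				   struct sketch_range range0)
{
	/* Mirrors the patch: an inverted layout is unexpected, use offset 0. */
	if (pgmap0.start > range0.start)
		return 0;
	return range0.start - pgmap0.start;
}

/* Virtual address that corresponds to the first byte of range0. */
static uintptr_t sketch_virt_addr(uintptr_t memremap_addr,
				  struct sketch_range pgmap0,
				  struct sketch_range range0)
{
	return memremap_addr + (uintptr_t)sketch_data_offset(pgmap0, range0);
}
```

With a pgmap starting at 0x100000000 and a first data range at
0x100200000, the sketch yields an offset of 0x200000, which is then added
to the mapped address before it is saved.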

Signed-off-by: John Groves
---
 drivers/dax/dax-private.h |  1 +
 drivers/dax/device.c      | 15 +++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 446617b73aea..df5b3d975df4 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -63,6 +63,7 @@ struct dax_mapping {
 struct dev_dax {
 	struct dax_region *region;
 	struct dax_device *dax_dev;
+	void *virt_addr;
 	unsigned int align;
 	int target_node;
 	bool dyn_id;
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 40ba660013cf..17323b5f6f57 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -372,6 +372,7 @@ static int dev_dax_probe(struct dev_dax *dev_dax)
 	struct dax_device *dax_dev = dev_dax->dax_dev;
 	struct device *dev = &dev_dax->dev;
 	struct dev_pagemap *pgmap;
+	u64 data_offset = 0;
 	struct inode *inode;
 	struct cdev *cdev;
 	void *addr;
@@ -426,6 +427,20 @@ static int dev_dax_probe(struct dev_dax *dev_dax)
 	if (IS_ERR(addr))
 		return PTR_ERR(addr);

+	/* Detect whether the data is at a non-zero offset into the memory */
+	if (pgmap->range.start != dev_dax->ranges[0].range.start) {
+		u64 phys = dev_dax->ranges[0].range.start;
+		u64 pgmap_phys = dev_dax->pgmap[0].range.start;
+		u64 vmemmap_shift = dev_dax->pgmap[0].vmemmap_shift;
+
+		if (!WARN_ON(pgmap_phys > phys))
+			data_offset = phys - pgmap_phys;
+
+		pr_debug("%s: offset detected phys=%llx pgmap_phys=%llx offset=%llx shift=%llx\n",
+		       __func__, phys, pgmap_phys, data_offset, vmemmap_shift);
+	}
+	dev_dax->virt_addr = addr + data_offset;
+
 	inode = dax_inode(dax_dev);
 	cdev = inode->i_cdev;
 	cdev_init(cdev, &dax_fops);

From patchwork Mon Apr 29 17:04:21 2024
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647388
From: John Groves
Subject: [RFC PATCH v2 05/12] dev_dax_iomap: Add dax_operations for use by fs-dax on devdax
Date: Mon, 29 Apr 2024 12:04:21 -0500
Message-Id: <2a8b926ce25a9ef242c933fa451b29401e62bb37.1714409084.git.john@groves.net>

Notes about this commit:

* These methods are based on pmem_dax_ops from drivers/nvdimm/pmem.c
* dev_dax_direct_access() returns the host physical address (hpa), the
  pfn and the kva. The kva is the value that dev_dax_probe() now stores
  in dev_dax->virt_addr.
* The hpa/pfn are used for mmap (dax_iomap_fault()), and the kva is used
  for read/write (dax_iomap_rw())
* dev_dax_recovery_write() and dev_dax_zero_page_range() have not been
  tested yet. I'm looking for suggestions as to how to test those.
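
For illustration only (this is not part of the patch): a userspace model
of what the direct_access operation hands back for a given page offset.
All names are hypothetical, a 4 KiB page size is assumed, and the device
is assumed to consist of one physically contiguous range (the real code
resolves the physical address through dax_pgoff_to_phys(), which handles
multiple ranges). The point is only the arithmetic: the kva and pfn are
both derived from the same byte offset, and the return value is the
number of pages that remain valid from that offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the device state the operation consults. */
struct sketch_dev {
	uintptr_t virt_addr;	/* kva of the first data byte */
	uint64_t phys_start;	/* hpa of the first data byte (single range) */
	size_t size;		/* device size in bytes */
};

struct sketch_access {
	uintptr_t kaddr;	/* consumed by the read/write path */
	uint64_t pfn;		/* consumed by the mmap fault path */
	long nr_valid;		/* pages usable starting at the offset */
};

static struct sketch_access sketch_direct_access(const struct sketch_dev *dev,
						 uint64_t pgoff, long nr_pages)
{
	size_t offset = (size_t)pgoff << SKETCH_PAGE_SHIFT;
	size_t want = (size_t)nr_pages << SKETCH_PAGE_SHIFT;
	size_t avail;
	struct sketch_access res;

	/* The sketch requires the offset to fall inside the device. */
	assert(offset < dev->size);
	avail = dev->size - offset;

	res.kaddr = dev->virt_addr + offset;
	res.pfn = (dev->phys_start + offset) >> SKETCH_PAGE_SHIFT;
	/* Clamp the request to what is left of the device. */
	res.nr_valid = (long)((want < avail ? want : avail) >> SKETCH_PAGE_SHIFT);
	return res;
}
```

Asking for 16 pages at page offset 2 of an 8-page device yields 6 valid
pages in this model; a caller is expected to loop if it needs more than
the returned count.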

Signed-off-by: John Groves
---
 drivers/dax/bus.c | 120 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 115 insertions(+), 5 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index f894272beab8..9c57d4139b74 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -7,6 +7,10 @@
 #include
 #include
 #include
+#include
+#include
+#include
+#include
 #include "dax-private.h"
 #include "bus.h"

@@ -1471,6 +1475,105 @@ __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 }
 EXPORT_SYMBOL_GPL(dax_pgoff_to_phys);

+#if IS_ENABLED(CONFIG_DEV_DAX_IOMAP)
+
+static void write_dax(void *pmem_addr, struct page *page,
+		unsigned int off, unsigned int len)
+{
+	unsigned int chunk;
+	void *mem;
+
+	while (len) {
+		mem = kmap_local_page(page);
+		chunk = min_t(unsigned int, len, PAGE_SIZE - off);
+		memcpy_flushcache(pmem_addr, mem + off, chunk);
+		kunmap_local(mem);
+		len -= chunk;
+		off = 0;
+		page++;
+		pmem_addr += chunk;
+	}
+}
+
+static long __dev_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
+			long nr_pages, enum dax_access_mode mode, void **kaddr,
+			pfn_t *pfn)
+{
+	struct dev_dax *dev_dax = dax_get_private(dax_dev);
+	size_t size = nr_pages << PAGE_SHIFT;
+	size_t offset = pgoff << PAGE_SHIFT;
+	void *virt_addr = dev_dax->virt_addr + offset;
+	u64 flags = PFN_DEV|PFN_MAP;
+	phys_addr_t phys;
+	pfn_t local_pfn;
+	size_t dax_size;
+
+	WARN_ON(!dev_dax->virt_addr);
+
+	if (down_read_interruptible(&dax_dev_rwsem))
+		return 0; /* no valid data since we were killed */
+	dax_size = dev_dax_size(dev_dax);
+	up_read(&dax_dev_rwsem);
+
+	phys = dax_pgoff_to_phys(dev_dax, pgoff, nr_pages << PAGE_SHIFT);
+
+	if (kaddr)
+		*kaddr = virt_addr;
+
+	local_pfn = phys_to_pfn_t(phys, flags); /* are flags correct? */
+	if (pfn)
+		*pfn = local_pfn;
+
+	/* This is the valid size at the specified address */
+	return PHYS_PFN(min_t(size_t, size, dax_size - offset));
+}
+
+static int dev_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
+				size_t nr_pages)
+{
+	long resid = nr_pages << PAGE_SHIFT;
+	long offset = pgoff << PAGE_SHIFT;
+
+	/* Break into one write per dax region */
+	while (resid > 0) {
+		void *kaddr;
+		pgoff_t poff = offset >> PAGE_SHIFT;
+		long len = __dev_dax_direct_access(dax_dev, poff,
+						nr_pages, DAX_ACCESS, &kaddr, NULL);
+		len = min_t(long, len, PAGE_SIZE);
+		write_dax(kaddr, ZERO_PAGE(0), offset, len);
+
+		offset += len;
+		resid -= len;
+	}
+	return 0;
+}
+
+static long dev_dax_direct_access(struct dax_device *dax_dev,
+		pgoff_t pgoff, long nr_pages, enum dax_access_mode mode,
+		void **kaddr, pfn_t *pfn)
+{
+	return __dev_dax_direct_access(dax_dev, pgoff, nr_pages, mode, kaddr, pfn);
+}
+
+static size_t dev_dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *i)
+{
+	size_t off;
+
+	off = offset_in_page(addr);
+
+	return _copy_from_iter_flushcache(addr, bytes, i);
+}
+
+static const struct dax_operations dev_dax_ops = {
+	.direct_access = dev_dax_direct_access,
+	.zero_page_range = dev_dax_zero_page_range,
+	.recovery_write = dev_dax_recovery_write,
+};
+
+#endif /* IS_ENABLED(CONFIG_DEV_DAX_IOMAP) */
+
 static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
 {
 	struct dax_region *dax_region = data->dax_region;
@@ -1526,11 +1629,18 @@ static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
 		}
 	}

-	/*
-	 * No dax_operations since there is no access to this device outside of
-	 * mmap of the resulting character device.
-	 */
-	dax_dev = alloc_dax(dev_dax, NULL);
+	if (IS_ENABLED(CONFIG_DEV_DAX_IOMAP))
+		/* holder_ops currently populated separately in a slightly
+		 * hacky way
+		 */
+		dax_dev = alloc_dax(dev_dax, &dev_dax_ops);
+	else
+		/*
+		 * No dax_operations since there is no access to this device
+		 * outside of mmap of the resulting character device.
+		 */
+		dax_dev = alloc_dax(dev_dax, NULL);
+
 	if (IS_ERR(dax_dev)) {
 		rc = PTR_ERR(dax_dev);
 		goto err_alloc_dax;

From patchwork Mon Apr 29 17:04:22 2024
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647389
From: John Groves
Subject: [RFC PATCH v2 06/12] dev_dax_iomap: export dax_dev_get()
Date: Mon, 29 Apr 2024 12:04:22 -0500

famfs needs access to dax_dev_get(), so make it non-static and export it.

Signed-off-by: John Groves
---
 drivers/dax/super.c | 3 ++-
 include/linux/dax.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 4b55f79849b0..8475093ba973 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -452,7 +452,7 @@ static int dax_set(struct inode *inode, void *data)
 	return 0;
 }

-static struct dax_device *dax_dev_get(dev_t devt)
+struct dax_device *dax_dev_get(dev_t devt)
 {
 	struct dax_device *dax_dev;
 	struct inode *inode;
@@ -475,6 +475,7 @@ static struct dax_device *dax_dev_get(dev_t devt)

 	return dax_dev;
 }
+EXPORT_SYMBOL_GPL(dax_dev_get);

 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
 {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 4a86716f932a..29d3dd6452c3 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -61,6 +61,7 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
 #if IS_ENABLED(CONFIG_DEV_DAX_IOMAP)
 int fs_dax_get(struct dax_device *dax_dev, void *holder, const struct dax_holder_operations *hops);
 struct dax_device *inode_dax(struct inode *inode);
+struct dax_device *dax_dev_get(dev_t devt);
 #endif
 void *dax_holder(struct dax_device *dax_dev);
 void put_dax(struct dax_device *dax_dev);

From patchwork Mon Apr 29 17:04:23 2024
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647390
From: John Groves
Subject: [RFC PATCH v2 07/12] famfs prep: Add fs/super.c:kill_char_super()
Date: Mon, 29 Apr 2024 12:04:23 -0500

Famfs needs a kill_super variant that differs slightly from the existing
ones. Implementing it locally in famfs would require exporting
d_genocide(); adding kill_char_super() to fs/super.c seemed a bit
cleaner.

Signed-off-by: John Groves
---
 fs/super.c         | 9 +++++++++
 include/linux/fs.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 69ce6c600968..cd276d30b522 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1236,6 +1236,15 @@ void kill_litter_super(struct super_block *sb)
 }
 EXPORT_SYMBOL(kill_litter_super);

+void kill_char_super(struct super_block *sb)
+{
+	if (sb->s_root)
+		d_genocide(sb->s_root);
+	generic_shutdown_super(sb);
+	kill_super_notify(sb);
+}
+EXPORT_SYMBOL(kill_char_super);
+
 int set_anon_super_fc(struct super_block *sb, struct fs_context *fc)
 {
 	return set_anon_super(sb, NULL);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfd53b52744..cc586f30397d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2511,6 +2511,7 @@ void generic_shutdown_super(struct super_block *sb);
 void kill_block_super(struct super_block *sb);
 void kill_anon_super(struct super_block *sb);
 void kill_litter_super(struct super_block *sb);
+void kill_char_super(struct super_block *sb);
 void deactivate_super(struct super_block *sb);
 void deactivate_locked_super(struct super_block *sb);
 int set_anon_super(struct super_block *s, void *data);

From patchwork Mon Apr 29 17:04:24 2024
X-Patchwork-Submitter: John Groves
X-Patchwork-Id: 13647391
From: John Groves
Subject: [RFC PATCH v2 08/12] famfs: module operations & fs_context
Date: Mon, 29 Apr 2024 12:04:24 -0500
Message-Id: <86694a1a663ab0b6e8e35c7b187f5ad179103482.1714409084.git.john@groves.net>

Start building up from the famfs module operations. This commit includes
the following:

* Register as a file system
* Parse mount parameters
* Allocate or find (and initialize) a superblock via famfs_get_tree()
* Look up the host dax device, and bail if it's in use (or not dax)
* Register as the holder of the dax device if it's available
* Add Kconfig and Makefile changes to build famfs
* Add FAMFS_SUPER_MAGIC to include/uapi/linux/magic.h
* Add export of fs/namei.c:may_open_dev(), which famfs needs to call
* Update MAINTAINERS file for the fs/famfs/ path

Famfs depends on the following exports, added earlier in this series:

* The new fs/super.c:kill_char_super(); the other kill*super helpers
  were not quite right.
* The dev_dax_iomap export of dax_dev_get()

This commit builds, but is otherwise too incomplete to run.

Signed-off-by: John Groves
---
 MAINTAINERS                |   1 +
 fs/Kconfig                 |   2 +
 fs/Makefile                |   1 +
 fs/famfs/Kconfig           |  10 ++
 fs/famfs/Makefile          |   5 +
 fs/famfs/famfs_inode.c     | 345 +++++++++++++++++++++++++++++++++++++
 fs/famfs/famfs_internal.h  |  36 ++++
 fs/namei.c                 |   1 +
 include/uapi/linux/magic.h |   1 +
 9 files changed, 402 insertions(+)
 create mode 100644 fs/famfs/Kconfig
 create mode 100644 fs/famfs/Makefile
 create mode 100644 fs/famfs/famfs_inode.c
 create mode 100644 fs/famfs/famfs_internal.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3f2d847dcf01..365d678e2f40 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8188,6 +8188,7 @@ L:	linux-cxl@vger.kernel.org
 L:	linux-fsdevel@vger.kernel.org
 S:	Supported
 F:	Documentation/filesystems/famfs.rst
+F:	fs/famfs

 FANOTIFY
 M:	Jan Kara
diff --git a/fs/Kconfig b/fs/Kconfig
index a46b0cbc4d8f..53b4629e92a0 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -140,6 +140,8 @@ source "fs/autofs/Kconfig"
 source "fs/fuse/Kconfig"
 source "fs/overlayfs/Kconfig"

+source "fs/famfs/Kconfig"
+
 menu "Caches"

 source "fs/netfs/Kconfig"
diff --git a/fs/Makefile b/fs/Makefile
index 6ecc9b0a53f2..3393f399a9e9 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -129,3 +129,4 @@ obj-$(CONFIG_EFIVAR_FS)	+= efivarfs/
 obj-$(CONFIG_EROFS_FS)	+= erofs/
 obj-$(CONFIG_VBOXSF_FS)	+= vboxsf/
 obj-$(CONFIG_ZONEFS_FS)	+= zonefs/
+obj-$(CONFIG_FAMFS)	+= famfs/
diff --git a/fs/famfs/Kconfig b/fs/famfs/Kconfig
new file mode 100644
index 000000000000..edb8980820f7
--- /dev/null
+++ b/fs/famfs/Kconfig
@@ -0,0 +1,10 @@
+
+
+config FAMFS
+	tristate "famfs: shared memory file system"
+	depends on DEV_DAX && FS_DAX && DEV_DAX_IOMAP
+	help
+	  Support for the famfs file system. Famfs is a dax file system that
+	  can support scale-out shared access to fabric-attached memory
+	  (e.g. CXL shared memory). Famfs is not a general purpose file system;
+	  it is an enabler for data sets in shared memory.
diff --git a/fs/famfs/Makefile b/fs/famfs/Makefile new file mode 100644 index 000000000000..62230bcd6793 --- /dev/null +++ b/fs/famfs/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_FAMFS) += famfs.o + +famfs-y := famfs_inode.o diff --git a/fs/famfs/famfs_inode.c b/fs/famfs/famfs_inode.c new file mode 100644 index 000000000000..61306240fc0b --- /dev/null +++ b/fs/famfs/famfs_inode.c @@ -0,0 +1,345 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2024 Micron Technology, Inc. + * + * This file system, originally based on ramfs plus the dax support from xfs, + * is intended to allow multiple host systems to mount a common file system + * view of dax files that map to shared memory. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "famfs_internal.h" + +#define FAMFS_DEFAULT_MODE 0755 + +static struct inode *famfs_get_inode(struct super_block *sb, + const struct inode *dir, + umode_t mode, dev_t dev) +{ + struct inode *inode = new_inode(sb); + struct timespec64 tv; + + if (!inode) + return NULL; + + inode->i_ino = get_next_ino(); + inode_init_owner(&nop_mnt_idmap, inode, dir, mode); + inode->i_mapping->a_ops = &ram_aops; + mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER); + mapping_set_unevictable(inode->i_mapping); + tv = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, tv); + inode_set_atime_to_ts(inode, tv); + + switch (mode & S_IFMT) { + default: + init_special_inode(inode, mode, dev); + break; + case S_IFREG: + inode->i_op = NULL /* famfs_file_inode_operations */; + inode->i_fop = NULL /* &famfs_file_operations */; + break; + case S_IFDIR: + inode->i_op = NULL /* famfs_dir_inode_operations */; + inode->i_fop = &simple_dir_operations; + + /* Directory inodes start off with i_nlink == 2 (for ".") */ + inc_nlink(inode); + break; + case 
S_IFLNK: + inode->i_op = &page_symlink_inode_operations; + inode_nohighmem(inode); + break; + } + return inode; +} + +/* + * famfs dax_operations (for char dax) + */ +static int +famfs_dax_notify_failure(struct dax_device *dax_dev, u64 offset, + u64 len, int mf_flags) +{ + struct super_block *sb = dax_holder(dax_dev); + struct famfs_fs_info *fsi = sb->s_fs_info; + + pr_err("%s: rootdev=%s offset=%llu len=%llu flags=%x\n", __func__, + fsi->rootdev, offset, len, mf_flags); + + return 0; +} + +static const struct dax_holder_operations famfs_dax_holder_ops = { + .notify_failure = famfs_dax_notify_failure, +}; + +/***************************************************************************** + * fs_context_operations + */ + +static int +famfs_fill_super(struct super_block *sb, struct fs_context *fc) +{ + int rc = 0; + + sb->s_maxbytes = MAX_LFS_FILESIZE; + sb->s_blocksize = PAGE_SIZE; + sb->s_blocksize_bits = PAGE_SHIFT; + sb->s_magic = FAMFS_SUPER_MAGIC; + sb->s_op = NULL /* famfs_super_ops */; + sb->s_time_gran = 1; + + return rc; +} + +static int +lookup_daxdev(const char *pathname, dev_t *devno) +{ + struct inode *inode; + struct path path; + int err; + + if (!pathname || !*pathname) + return -EINVAL; + + err = kern_path(pathname, LOOKUP_FOLLOW, &path); + if (err) + return err; + + inode = d_backing_inode(path.dentry); + if (!S_ISCHR(inode->i_mode)) { + err = -EINVAL; + goto out_path_put; + } + + if (!may_open_dev(&path)) { /* had to export this */ + err = -EACCES; + goto out_path_put; + } + + /* if it's dax, i_rdev is the dev_t of the dax device */ + *devno = inode->i_rdev; + +out_path_put: + path_put(&path); + return err; +} + +static int +famfs_get_tree(struct fs_context *fc) +{ + struct famfs_fs_info *fsi = fc->s_fs_info; + struct dax_device *dax_devp; + struct super_block *sb; + struct inode *inode; + dev_t daxdevno; + int err; + + /* TODO: clean up chatty messages */ + + err = lookup_daxdev(fc->source, &daxdevno); + if (err) + return err; + + fsi->daxdevno = daxdevno; + 
+ /* This will set sb->s_dev=daxdevno */ + sb = sget_dev(fc, daxdevno); + if (IS_ERR(sb)) { + pr_err("%s: sget_dev error\n", __func__); + return PTR_ERR(sb); + } + + if (sb->s_root) { + pr_info("%s: found a matching superblock for %s\n", + __func__, fc->source); + + /* We don't expect to find a match by dev_t; if we do, it must + * already be mounted, so we bail + */ + err = -EBUSY; + goto deactivate_out; + } else { + pr_info("%s: initializing new superblock for %s\n", + __func__, fc->source); + err = famfs_fill_super(sb, fc); + if (err) + goto deactivate_out; + } + + /* This will fail if it's not a dax device */ + dax_devp = dax_dev_get(daxdevno); + if (!dax_devp) { + pr_warn("%s: device %s not found or not dax\n", + __func__, fc->source); + err = -ENODEV; + goto deactivate_out; + } + + err = fs_dax_get(dax_devp, sb, &famfs_dax_holder_ops); + if (err) { + pr_err("%s: fs_dax_get(%llu) failed\n", __func__, (u64)daxdevno); + err = -EBUSY; + goto deactivate_out; + } + fsi->dax_devp = dax_devp; + + inode = famfs_get_inode(sb, NULL, S_IFDIR | fsi->mount_opts.mode, 0); + sb->s_root = d_make_root(inode); + if (!sb->s_root) { + pr_err("%s: d_make_root() failed\n", __func__); + err = -ENOMEM; + fs_put_dax(fsi->dax_devp, sb); + goto deactivate_out; + } + + sb->s_flags |= SB_ACTIVE; + + WARN_ON(fc->root); + fc->root = dget(sb->s_root); + return err; + +deactivate_out: + pr_debug("%s: deactivating sb=%llx\n", __func__, (u64)sb); + deactivate_locked_super(sb); + return err; +} + +/*****************************************************************************/ + +enum famfs_param { + Opt_mode, + Opt_dax, +}; + +const struct fs_parameter_spec famfs_fs_parameters[] = { + fsparam_u32oct("mode", Opt_mode), + fsparam_string("dax", Opt_dax), + {} +}; + +static int famfs_parse_param(struct fs_context *fc, struct fs_parameter *param) +{ + struct famfs_fs_info *fsi = fc->s_fs_info; + struct fs_parse_result result; + int opt; + + opt = fs_parse(fc, famfs_fs_parameters, param, &result); + if 
(opt == -ENOPARAM) { + opt = vfs_parse_fs_param_source(fc, param); + if (opt != -ENOPARAM) + return opt; + + return 0; + } + if (opt < 0) + return opt; + + switch (opt) { + case Opt_mode: + fsi->mount_opts.mode = result.uint_32 & S_IALLUGO; + break; + case Opt_dax: + if (strcmp(param->string, "always")) + pr_notice("%s: invalid dax mode %s\n", + __func__, param->string); + break; + } + + return 0; +} + +static void famfs_free_fc(struct fs_context *fc) +{ + struct famfs_fs_info *fsi = fc->s_fs_info; + + if (fsi && fsi->rootdev) + kfree(fsi->rootdev); + + kfree(fsi); +} + +static const struct fs_context_operations famfs_context_ops = { + .free = famfs_free_fc, + .parse_param = famfs_parse_param, + .get_tree = famfs_get_tree, +}; + +static int famfs_init_fs_context(struct fs_context *fc) +{ + struct famfs_fs_info *fsi; + + fsi = kzalloc(sizeof(*fsi), GFP_KERNEL); + if (!fsi) + return -ENOMEM; + + fsi->mount_opts.mode = FAMFS_DEFAULT_MODE; + fc->s_fs_info = fsi; + fc->ops = &famfs_context_ops; + return 0; +} + +static void famfs_kill_sb(struct super_block *sb) +{ + struct famfs_fs_info *fsi = sb->s_fs_info; + + if (fsi && fsi->dax_devp) + fs_put_dax(fsi->dax_devp, sb); + if (fsi && fsi->rootdev) + kfree(fsi->rootdev); + kfree(fsi); + sb->s_fs_info = NULL; + + kill_char_super(sb); /* new */ +} + +#define MODULE_NAME "famfs" +static struct file_system_type famfs_fs_type = { + .name = MODULE_NAME, + .init_fs_context = famfs_init_fs_context, + .parameters = famfs_fs_parameters, + .kill_sb = famfs_kill_sb, + .fs_flags = FS_USERNS_MOUNT, +}; + +/****************************************************************************** + * Module stuff + */ +static int __init init_famfs_fs(void) +{ + int rc; + + rc = register_filesystem(&famfs_fs_type); + + return rc; +} + +static void +__exit famfs_exit(void) +{ + unregister_filesystem(&famfs_fs_type); + pr_info("%s: unregistered\n", __func__); +} + +fs_initcall(init_famfs_fs); +module_exit(famfs_exit); + +MODULE_AUTHOR("John Groves, Micron 
Technology"); +MODULE_LICENSE("GPL"); diff --git a/fs/famfs/famfs_internal.h b/fs/famfs/famfs_internal.h new file mode 100644 index 000000000000..951b32ec4fbd --- /dev/null +++ b/fs/famfs/famfs_internal.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2024 Micron Technology, Inc. + * + * This file system, originally based on ramfs plus the dax support from xfs, + * is intended to allow multiple host systems to mount a common file system + * view of dax files that map to shared memory. + */ +#ifndef FAMFS_INTERNAL_H +#define FAMFS_INTERNAL_H + +struct famfs_mount_opts { + umode_t mode; +}; + +/** + * struct famfs_fs_info - per-superblock famfs state + * + * @mount_opts: the mount options + * @dax_devp: The underlying character devdax device + * @rootdev: Dax device path used in mount + * @daxdevno: Dax device dev_t + * @deverror: True if the dax device has called our notify_failure entry + * point, or if other "shutdown" conditions exist + */ +struct famfs_fs_info { + struct famfs_mount_opts mount_opts; + struct dax_device *dax_devp; + char *rootdev; + dev_t daxdevno; + bool deverror; +}; + +#endif /* FAMFS_INTERNAL_H */ diff --git a/fs/namei.c b/fs/namei.c index c5b2a25be7d0..f24b268473cd 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3229,6 +3229,7 @@ bool may_open_dev(const struct path *path) return !(path->mnt->mnt_flags & MNT_NODEV) && !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV); } +EXPORT_SYMBOL(may_open_dev); static int may_open(struct mnt_idmap *idmap, const struct path *path, int acc_mode, int flag) diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index 1b40a968ba91..e9bdd6a415e2 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -37,6 +37,7 @@ #define HOSTFS_SUPER_MAGIC 0x00c0ffee #define OVERLAYFS_SUPER_MAGIC 0x794c7630 #define FUSE_SUPER_MAGIC 0x65735546 +#define FAMFS_SUPER_MAGIC 0x87b282ff #define MINIX_SUPER_MAGIC 0x137F /* minix v1 fs, 14 char names 
*/ #define MINIX_SUPER_MAGIC2 0x138F /* minix v1 fs, 30 char names */ From patchwork Mon Apr 29 17:04:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Groves X-Patchwork-Id: 13647392 Received: from mail-ot1-f53.google.com (mail-ot1-f53.google.com [209.85.210.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E9D85127B45; Mon, 29 Apr 2024 17:05:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410326; cv=none; b=Rp+QpkUefA2BqkvkToUfF9ii72OPfLq8eNtDBQFyRB3DG0BFymKivmBvUBumt+9JixzASkBIXAaYO3iK6JYjvfZ+p/V8+nwxG2E3Lw0jrsTWNLXQWZMf43pI7jYLBwI+2hg0vopoTv+d8ArstA0B/1/NcYHNnpuNWWQ/Lu2n1t0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410326; c=relaxed/simple; bh=3okyjXHCMhcAJ0EEcEcf6cQaC7PkHdUAYWpAVtSMGPY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=S9Ha8DuorZHcX14WXutGHsxgDMbf12TiG2nGcfRlbaKhZ99CzWRuZKHLpE6/osxN4V4yYhlQBToXuleoxeI/31E8OcapRPAnUwUgfQYb0gYAKxglUltpA00E6F2R9DoNWvf5qG9kFxpavWVs6oAIeFN/MCn6P8x6Ls1T50qUCDk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=QWjl1urZ; arc=none smtp.client-ip=209.85.210.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="QWjl1urZ" Received: by mail-ot1-f53.google.com with SMTP id 46e09a7af769-6ea2375d8d0so3895028a34.0; Mon, 
29 Apr 2024 10:05:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714410323; x=1715015123; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=2UaofW1Ss7ysgCpPsl4E9jKgtjI2OUWLad4tGWrXZwU=; b=QWjl1urZ0wbXplKUjcZEeCDJu/tiuW9/z3PZllkb4HWCedlObKDJEIf9qfO4vL8vGy MdsnU/ljt0C/rdmmt6IM/FFh7EFpTIbzxeSXWO6OByBDZddD1v66lS0LQkVxQ2yHfZCa xzaw/T3226aH70LQCgip6axYFX5BLzsYFcHYz1x5Ttj4d/dOzfCMHOmegLw8QPtjftXU llTmnuvKXpcjM+dQa91ibhtoeoTdGQLSwwTYdsF4mznH3jg+HiajrkbA5n6gBfsj814q fGOArCb9Jml569izgdIZ3BYP2Q94sUBHT7Aq7pncfmd7/q2LbYOioY9KiViknmEuZ/Rd Xz7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714410323; x=1715015123; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=2UaofW1Ss7ysgCpPsl4E9jKgtjI2OUWLad4tGWrXZwU=; b=jbGwngemDJZtwriaUpvyNtI4DSl72RLKyVQ1cMP0QJpMrXWwxvM/q723yjUS0oSgn1 y6lDuBq6RXYx/jervI9M6HM5tGxQ7ZECiHKRgDyW8xubuiI0rH/QqHnjk8iPQoUCO0y6 ykL/iCkQUH/P4s++jDSsvi1h5jOeGx0VS6qpwmJg4Rj4KWVLWTTZhJqKKqhD73Bno9hI XvLooZMQe+6io5Skb/ItFGdb2Dy6KZqRBBxq5fMUYoNGYVIyz5p861iE5lFuaHpArptq MMonq3fH0N2qHTWepAV2mg/bLGnpR1s7WDgSQmid6af689d1nWHmEFb3P2aWIESYDTlh /nQA== X-Forwarded-Encrypted: i=1; AJvYcCWbd4duhT5vSHMp4N2mW9Inm1xp6yU0l1Z71JJDSBVmUCXAvxy7yU3ZEL2w7638Ddkf61HBi8nhym2ID7lYcNwmycZHVp0MorcBwUhEt58D51xYhZkBc3VAfXYn2ShERvstmwbzuOYRLQ== X-Gm-Message-State: AOJu0YzSsZCkpt5bhG+KO9Xoa7nrSjq1MA3eG8pmEsgXJjzKWoRB4sTD kBn7AIju8vpCRQ5p+dWM95DttvEKTeWPEaWpIYYHrTydpUEz3rBhEz+n+LhT X-Google-Smtp-Source: AGHT+IFvD88Udyo+NvOBr4PNFG9gOR+wiZU+XL9Ie5iTR0+/Cv+NkgpNj5yEeSCkfCvrufai21KfAw== X-Received: by 2002:a05:6830:4513:b0:6eb:d349:8c3f with SMTP id i19-20020a056830451300b006ebd3498c3fmr13821572otv.28.1714410323111; Mon, 29 Apr 2024 10:05:23 -0700 (PDT) 
Received: from localhost.localdomain ([70.114.203.196]) by smtp.gmail.com with ESMTPSA id g1-20020a9d6201000000b006ea20712e66sm4074448otj.17.2024.04.29.10.05.20 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 29 Apr 2024 10:05:22 -0700 (PDT) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick J . Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves Subject: [RFC PATCH v2 09/12] famfs: Introduce inode_operations and super_operations Date: Mon, 29 Apr 2024 12:04:25 -0500 Message-Id: X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The famfs inode and super operations are pretty much generic. 
This commit builds but is still too incomplete to run Signed-off-by: John Groves --- fs/famfs/famfs_inode.c | 113 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 110 insertions(+), 3 deletions(-) diff --git a/fs/famfs/famfs_inode.c b/fs/famfs/famfs_inode.c index 61306240fc0b..e00e9cdecadf 100644 --- a/fs/famfs/famfs_inode.c +++ b/fs/famfs/famfs_inode.c @@ -28,6 +28,9 @@ #define FAMFS_DEFAULT_MODE 0755 +static const struct inode_operations famfs_file_inode_operations; +static const struct inode_operations famfs_dir_inode_operations; + static struct inode *famfs_get_inode(struct super_block *sb, const struct inode *dir, umode_t mode, dev_t dev) @@ -52,11 +55,11 @@ static struct inode *famfs_get_inode(struct super_block *sb, init_special_inode(inode, mode, dev); break; case S_IFREG: - inode->i_op = NULL /* famfs_file_inode_operations */; + inode->i_op = &famfs_file_inode_operations; inode->i_fop = NULL /* &famfs_file_operations */; break; case S_IFDIR: - inode->i_op = NULL /* famfs_dir_inode_operations */; + inode->i_op = &famfs_dir_inode_operations; inode->i_fop = &simple_dir_operations; /* Directory inodes start off with i_nlink == 2 (for ".") */ @@ -70,6 +73,110 @@ static struct inode *famfs_get_inode(struct super_block *sb, return inode; } +/*************************************************************************** + * famfs inode_operations: these are currently pretty much boilerplate + */ + +static const struct inode_operations famfs_file_inode_operations = { + /* All generic */ + .setattr = simple_setattr, + .getattr = simple_getattr, +}; + +/* + * File creation. Allocate an inode, and we're done.. 
+ */ +static int +famfs_mknod(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, + umode_t mode, dev_t dev) +{ + struct famfs_fs_info *fsi = dir->i_sb->s_fs_info; + struct timespec64 tv; + struct inode *inode; + + if (fsi->deverror) + return -ENODEV; + + inode = famfs_get_inode(dir->i_sb, dir, mode, dev); + if (!inode) + return -ENOSPC; + + d_instantiate(dentry, inode); + dget(dentry); /* Extra count - pin the dentry in core */ + tv = inode_set_ctime_current(inode); + inode_set_mtime_to_ts(inode, tv); + inode_set_atime_to_ts(inode, tv); + + return 0; +} + +static int famfs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) +{ + struct famfs_fs_info *fsi = dir->i_sb->s_fs_info; + int rc; + + if (fsi->deverror) + return -ENODEV; + + rc = famfs_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFDIR, 0); + if (rc) + return rc; + + inc_nlink(dir); + + return 0; +} + +static int famfs_create(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode, bool excl) +{ + struct famfs_fs_info *fsi = dir->i_sb->s_fs_info; + + if (fsi->deverror) + return -ENODEV; + + return famfs_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFREG, 0); +} + +static const struct inode_operations famfs_dir_inode_operations = { + .create = famfs_create, + .lookup = simple_lookup, + .link = simple_link, + .unlink = simple_unlink, + .mkdir = famfs_mkdir, + .rmdir = simple_rmdir, + .rename = simple_rename, +}; + +/***************************************************************************** + * famfs super_operations + * + * TODO: implement a famfs_statfs() that shows size, free and available space, + * etc. + */ + +/* + * famfs_show_options() - Display the mount options in /proc/mounts. 
+ */ +static int famfs_show_options(struct seq_file *m, struct dentry *root) +{ + struct famfs_fs_info *fsi = root->d_sb->s_fs_info; + + if (fsi->mount_opts.mode != FAMFS_DEFAULT_MODE) + seq_printf(m, ",mode=%o", fsi->mount_opts.mode); + + return 0; +} + +static const struct super_operations famfs_super_ops = { + .statfs = simple_statfs, + .drop_inode = generic_delete_inode, + .show_options = famfs_show_options, +}; + +/*****************************************************************************/ + /* * famfs dax_operations (for char dax) */ @@ -103,7 +210,7 @@ famfs_fill_super(struct super_block *sb, struct fs_context *fc) sb->s_blocksize = PAGE_SIZE; sb->s_blocksize_bits = PAGE_SHIFT; sb->s_magic = FAMFS_SUPER_MAGIC; - sb->s_op = NULL /* famfs_super_ops */; + sb->s_op = &famfs_super_ops; sb->s_time_gran = 1; return rc; From patchwork Mon Apr 29 17:04:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Groves X-Patchwork-Id: 13647393 Received: from mail-ot1-f54.google.com (mail-ot1-f54.google.com [209.85.210.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F67C86AFC; Mon, 29 Apr 2024 17:05:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410330; cv=none; b=jsXAe87opXa4zBDlEe/7MV/pzYCRmZQ8xoQW6yZkNeyNJ1dmvfwQIFXEUNjdhjwXzAZwAKB43Gpum1qHVqOGQeCrlO5TE2upjLhTIplpyLOv40DC9H4TtoJXBwn8Lr1EqgJEHtJf9mOe5iGZ6aWeWlUTM2QnUxe0WUvvQG/7MIk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410330; c=relaxed/simple; bh=f7nr9BHgMNY07SuBb7AMIuKL8EhAF6Lae/DCFaA0XMM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=B6ZiHLulbMcyAZ/AxtYumUs9YWLbaSSfzZSKfH9/fU355WtJ2snui9YZ3aiFMZ41NlHhgL6isGaZPTQpwhP4AvUnYTQhDqIZOji1Va0FP0GyG65kL5b4oPM/CA0YCsyfIolQC233qHwfdMczlUup0C6tDj7RXh2niMRhd+ZnlTY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=didp0iEx; arc=none smtp.client-ip=209.85.210.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="didp0iEx" Received: by mail-ot1-f54.google.com with SMTP id 46e09a7af769-6ea1a55b0c0so2432798a34.3; Mon, 29 Apr 2024 10:05:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1714410328; x=1715015128; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=MvRMVdHLhDXhOx55ypqAZ1Z/pDm9XYtxQRRB/JnE99c=; b=didp0iExgM+JMPqPE0C7wqUFsDhroOnu9Jq6rSGAb95JxHl7gS7yRmHU9d/gwgBmGd d9swxdm12Z9NXbMLHcT1eVh2VG5qENmVwx4+r2wEzbwS0qdmhkXUjlNJXHHLbfZSkY0j 7lX1gpRNLMPow90jad9pRI42orAvXjOIobfiBsVgCeTkYrqTyr3FQJW46ZqGHvCDvDUB fYZrFLsUZ+HR1W4n+tck2Ro3mP4nVhNTGmcR0wwfW20hIw/KUGc4BodLGuBJ5GpnKf7L Kl14vZ/Q44zdfvz7LTEZzM9c5yJ8jQuZfYkDNu66vrqLByRdYFv0VpQE7pot5MXR/9fh omHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714410328; x=1715015128; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=MvRMVdHLhDXhOx55ypqAZ1Z/pDm9XYtxQRRB/JnE99c=; b=jBnx+ZBhlxSMCrgV9tbbhuVq4AEtzTB3bl+9oEtI8gz3jv05P2yd6Tuurq3SGLKnNZ 
Fs1MaOE64nVZHfYrqt9T7FmnTdAaMHnPQDPJESMoLb67zPhfZ1MJKX/TbS2ZUi2/2pfn vBVwO+46SGzkppt5a/gi9WcGzDhKDYNLdKZ3dtiNwPppMHPRZwDKLfj9LyuhcjE05ZgF K34U8Ow9ZKZvog2MrFhr1VOfOYSd+6hrT7cselA7U92oKzlm0wdwyE1V2eVVMbqGKDP4 XUWm6Ff7guvnuenL0aBZ4ne5xXmOC4o/p9VmV3+APR/thiMwdruLPu2nK2nuXaW+bAOz 0irQ== X-Forwarded-Encrypted: i=1; AJvYcCXGkEGMTGgV4JUS94CLONTOxvScWJee0YFErsjIHX+1ZVZbnZJ20LKBK0p8yAS0JPzKlBiBJ9CVlkf6P8G6429ZcrRk24gUx3Q4XwAAI23A9Bhg8H6qDhanfCAXkcBDZN5kq9DNDT6/xQ== X-Gm-Message-State: AOJu0YyoW5MvFMW+IGwkqcTKfv9iK9qjI6+tet8Xx47ItQkiJMPbVXus P5Vzro38sainp1DN+xs5I1WFFpMP7dh4YAY2Q5oz80brbyNRl2zc X-Google-Smtp-Source: AGHT+IG/ZcwB0huKygPBvFdcgDmjShYrcDklwOFhU2UIhkvvjhkEis6/aiEjDF1au3eTMnKT4Qud0g== X-Received: by 2002:a05:6830:59:b0:6ee:3232:160a with SMTP id d25-20020a056830005900b006ee3232160amr328210otp.38.1714410326944; Mon, 29 Apr 2024 10:05:26 -0700 (PDT) Received: from localhost.localdomain ([70.114.203.196]) by smtp.gmail.com with ESMTPSA id g1-20020a9d6201000000b006ea20712e66sm4074448otj.17.2024.04.29.10.05.24 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 29 Apr 2024 10:05:26 -0700 (PDT) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick J . 
Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves Subject: [RFC PATCH v2 10/12] famfs: Introduce file_operations read/write Date: Mon, 29 Apr 2024 12:04:26 -0500 Message-Id: <4584f1e26802af540a60eadb70f42c6ac5fe4679.1714409084.git.john@groves.net> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This commit introduces fs/famfs/famfs_file.c and the famfs file_operations for read/write. This is not usable yet because: * It calls dax_iomap_rw() with NULL iomap_ops (which will be introduced in a subsequent commit). * famfs_ioctl() is coming in a later commit, and it is necessary to map a file to a memory allocation. Signed-off-by: John Groves --- fs/famfs/Makefile | 2 +- fs/famfs/famfs_file.c | 122 ++++++++++++++++++++++++++++++++++++++ fs/famfs/famfs_inode.c | 2 +- fs/famfs/famfs_internal.h | 2 + 4 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 fs/famfs/famfs_file.c diff --git a/fs/famfs/Makefile b/fs/famfs/Makefile index 62230bcd6793..8cac90c090a4 100644 --- a/fs/famfs/Makefile +++ b/fs/famfs/Makefile @@ -2,4 +2,4 @@ obj-$(CONFIG_FAMFS) += famfs.o -famfs-y := famfs_inode.o +famfs-y := famfs_inode.o famfs_file.o diff --git a/fs/famfs/famfs_file.c b/fs/famfs/famfs_file.c new file mode 100644 index 000000000000..48036c71d4ed --- /dev/null +++ b/fs/famfs/famfs_file.c @@ -0,0 +1,122 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2024 Micron Technology, Inc. + * + * This file system, originally based on ramfs plus the dax support from xfs, + * is intended to allow multiple host systems to mount a common file system + * view of dax files that map to shared memory. 
+ */ + +#include +#include +#include +#include + +#include "famfs_internal.h" + +/********************************************************************* + * file_operations + */ + +/* Reject I/O to files that aren't in a valid state */ +static ssize_t +famfs_file_invalid(struct inode *inode) +{ + if (!IS_DAX(inode)) { + pr_debug("%s: inode %llx IS_DAX is false\n", __func__, (u64)inode); + return -ENXIO; + } + return 0; +} + +static ssize_t +famfs_rw_prep(struct kiocb *iocb, struct iov_iter *ubuf) +{ + struct inode *inode = iocb->ki_filp->f_mapping->host; + struct super_block *sb = inode->i_sb; + struct famfs_fs_info *fsi = sb->s_fs_info; + size_t i_size = i_size_read(inode); + size_t count = iov_iter_count(ubuf); + size_t max_count; + ssize_t rc; + + if (fsi->deverror) + return -ENODEV; + + rc = famfs_file_invalid(inode); + if (rc) + return rc; + + max_count = max_t(ssize_t, 0, i_size - iocb->ki_pos); + + if (count > max_count) + iov_iter_truncate(ubuf, max_count); + + if (!iov_iter_count(ubuf)) + return 0; + + return rc; +} + +static ssize_t +famfs_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + ssize_t rc; + + rc = famfs_rw_prep(iocb, to); + if (rc) + return rc; + + if (!iov_iter_count(to)) + return 0; + + rc = dax_iomap_rw(iocb, to, NULL /*&famfs_iomap_ops */); + + file_accessed(iocb->ki_filp); + return rc; +} + +/** + * famfs_dax_write_iter() + * + * We need our own write-iter in order to prevent append + * + * @iocb: kiocb describing the file and position for the write + * @from: iterator describing the user memory source for the write + */ +static ssize_t +famfs_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) +{ + ssize_t rc; + + rc = famfs_rw_prep(iocb, from); + if (rc) + return rc; + + if (!iov_iter_count(from)) + return 0; + + return dax_iomap_rw(iocb, from, NULL /*&famfs_iomap_ops*/); +} + +const struct file_operations famfs_file_operations = { + .owner = THIS_MODULE, + + /* Custom famfs operations */ + .write_iter = famfs_dax_write_iter, + .read_iter = famfs_dax_read_iter, + 
.unlocked_ioctl = NULL /*famfs_file_ioctl*/, + .mmap = NULL /* famfs_file_mmap */, + + /* Force PMD alignment for mmap */ + .get_unmapped_area = thp_get_unmapped_area, + + /* Generic Operations */ + .fsync = noop_fsync, + .splice_read = filemap_splice_read, + .splice_write = iter_file_splice_write, + .llseek = generic_file_llseek, +}; + diff --git a/fs/famfs/famfs_inode.c b/fs/famfs/famfs_inode.c index e00e9cdecadf..490a2c0fd326 100644 --- a/fs/famfs/famfs_inode.c +++ b/fs/famfs/famfs_inode.c @@ -56,7 +56,7 @@ static struct inode *famfs_get_inode(struct super_block *sb, break; case S_IFREG: inode->i_op = &famfs_file_inode_operations; - inode->i_fop = NULL /* &famfs_file_operations */; + inode->i_fop = &famfs_file_operations; break; case S_IFDIR: inode->i_op = &famfs_dir_inode_operations; diff --git a/fs/famfs/famfs_internal.h b/fs/famfs/famfs_internal.h index 951b32ec4fbd..36efaef425e7 100644 --- a/fs/famfs/famfs_internal.h +++ b/fs/famfs/famfs_internal.h @@ -11,6 +11,8 @@ #ifndef FAMFS_INTERNAL_H #define FAMFS_INTERNAL_H +extern const struct file_operations famfs_file_operations; + struct famfs_mount_opts { umode_t mode; }; From patchwork Mon Apr 29 17:04:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Groves X-Patchwork-Id: 13647394 Received: from mail-oa1-f51.google.com (mail-oa1-f51.google.com [209.85.160.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7FB6F127E22; Mon, 29 Apr 2024 17:05:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410336; cv=none; b=NW5JJkjqHYMO3dHIPKQjHiOaTWdQVwkPHfB9ocwgw+dwT/4hN5es1pv5uNhN35HhFAWi7wd4/a70+IEoLcL5Rj+F5IIFxRcmDCu2+VfAP+NjUpVxaAILVmTRvRybuMIhK8Si3PIobcS0R4mOBdrpMc9Rp7LusTOnI6/9l+q/CJc= 
Sender: John Groves
From: John Groves
To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev
Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick
J . Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves Subject: [RFC PATCH v2 11/12] famfs: Introduce mmap and VM fault handling Date: Mon, 29 Apr 2024 12:04:27 -0500 Message-Id: <744981e208f94d5fc12549e48b775d10cee550e8.1714409084.git.john@groves.net> X-Mailer: git-send-email 2.39.3 (Apple Git-146) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This commit adds vm_operations, plus famfs_mmap() and fault handlers. It is still missing iomap_ops, iomap mapping resolution, and famfs_ioctl() for setting up file-to-memory mappings. Signed-off-by: John Groves --- fs/famfs/famfs_file.c | 108 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 106 insertions(+), 2 deletions(-) diff --git a/fs/famfs/famfs_file.c b/fs/famfs/famfs_file.c index 48036c71d4ed..585b776dd73c 100644 --- a/fs/famfs/famfs_file.c +++ b/fs/famfs/famfs_file.c @@ -16,6 +16,88 @@ #include "famfs_internal.h" +/********************************************************************* + * vm_operations + */ +static vm_fault_t +__famfs_filemap_fault(struct vm_fault *vmf, unsigned int pe_size, + bool write_fault) +{ + struct inode *inode = file_inode(vmf->vma->vm_file); + struct super_block *sb = inode->i_sb; + struct famfs_fs_info *fsi = sb->s_fs_info; + vm_fault_t ret; + pfn_t pfn; + + if (fsi->deverror) + return VM_FAULT_SIGBUS; + + if (!IS_DAX(file_inode(vmf->vma->vm_file))) { + pr_err("%s: file not marked IS_DAX!!\n", __func__); + return VM_FAULT_SIGBUS; + } + + if (write_fault) { + sb_start_pagefault(inode->i_sb); + file_update_time(vmf->vma->vm_file); + } + + ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL, NULL /*&famfs_iomap_ops */); + if (ret & VM_FAULT_NEEDDSYNC) + ret = dax_finish_sync_fault(vmf, pe_size, pfn); + + if (write_fault) + sb_end_pagefault(inode->i_sb); + + return ret; +} + 
+static inline bool +famfs_is_write_fault(struct vm_fault *vmf) +{ + return (vmf->flags & FAULT_FLAG_WRITE) && + (vmf->vma->vm_flags & VM_SHARED); +} + +static vm_fault_t +famfs_filemap_fault(struct vm_fault *vmf) +{ + return __famfs_filemap_fault(vmf, 0, famfs_is_write_fault(vmf)); +} + +static vm_fault_t +famfs_filemap_huge_fault(struct vm_fault *vmf, unsigned int pe_size) +{ + return __famfs_filemap_fault(vmf, pe_size, famfs_is_write_fault(vmf)); +} + +static vm_fault_t +famfs_filemap_page_mkwrite(struct vm_fault *vmf) +{ + return __famfs_filemap_fault(vmf, 0, true); +} + +static vm_fault_t +famfs_filemap_pfn_mkwrite(struct vm_fault *vmf) +{ + return __famfs_filemap_fault(vmf, 0, true); +} + +static vm_fault_t +famfs_filemap_map_pages(struct vm_fault *vmf, pgoff_t start_pgoff, + pgoff_t end_pgoff) +{ + return filemap_map_pages(vmf, start_pgoff, end_pgoff); +} + +const struct vm_operations_struct famfs_file_vm_ops = { + .fault = famfs_filemap_fault, + .huge_fault = famfs_filemap_huge_fault, + .map_pages = famfs_filemap_map_pages, + .page_mkwrite = famfs_filemap_page_mkwrite, + .pfn_mkwrite = famfs_filemap_pfn_mkwrite, +}; + /********************************************************************* * file_operations */ @@ -25,7 +107,8 @@ static ssize_t famfs_file_invalid(struct inode *inode) { if (!IS_DAX(inode)) { - pr_debug("%s: inode %llx IS_DAX is false\n", __func__, (u64)inode); + pr_debug("%s: inode %llx IS_DAX is false\n", + __func__, (u64)inode); return -ENXIO; } return 0; @@ -101,6 +184,27 @@ famfs_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) return dax_iomap_rw(iocb, from, NULL /*&famfs_iomap_ops*/); } +static int +famfs_file_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct inode *inode = file_inode(file); + struct super_block *sb = inode->i_sb; + struct famfs_fs_info *fsi = sb->s_fs_info; + ssize_t rc; + + if (fsi->deverror) + return -ENODEV; + + rc = famfs_file_invalid(inode); + if (rc) + return (int)rc; + + file_accessed(file); 
+ vma->vm_ops = &famfs_file_vm_ops; + vm_flags_set(vma, VM_HUGEPAGE); + return 0; +} + const struct file_operations famfs_file_operations = { .owner = THIS_MODULE, @@ -108,7 +212,7 @@ const struct file_operations famfs_file_operations = { .write_iter = famfs_dax_write_iter, .read_iter = famfs_dax_read_iter, .unlocked_ioctl = NULL /*famfs_file_ioctl*/, - .mmap = NULL /* famfs_file_mmap */, + .mmap = famfs_file_mmap, /* Force PMD alignment for mmap */ .get_unmapped_area = thp_get_unmapped_area, From patchwork Mon Apr 29 17:04:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Groves X-Patchwork-Id: 13647395 Received: from mail-ot1-f41.google.com (mail-ot1-f41.google.com [209.85.210.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 769AA127E28; Mon, 29 Apr 2024 17:05:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410343; cv=none; b=DsrLvxOnRDKxot1nxo0npkSKm/68YQClG0e1S8ggy0pg4QmjAOoqGPNK02KDaYwggU/1pU6hmeohPBSR9v+9bhfschQSh6VPOjJu1mv/ZFz7R2Ummr6xJK2sSLnU4wk5ZFrbFhhcXrH8xpiSXFoKVzmYDFjg0Z6wOrYcHbFR0rw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714410343; c=relaxed/simple; bh=uraQR7x/sGs8bT6f+ne9e8Dyo7H0N2VLJU5n4krNVvw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=QxEVcAUwh9kob5BKXG3rVIccvU+EINupez5LyBs07YklLvl2qu6eNc5qPl1anTtUqsjqY6IZA0m4dMC9XjPe8Jqr+Cb6wEvJDVid9wQqkklse9pK8QA+xSOewotXzkVeSHsBWdj8dRueV5ZVb30Xy39qqTmxJM+2hu/UNR0fFQo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=RDVzS2ar; arc=none 
Sender: John Groves
From: John Groves
To: John Groves , Jonathan Corbet , Jonathan Cameron , Dan Williams , Vishal Verma , Dave Jiang , Alexander Viro , Christian Brauner , Jan Kara , Matthew Wilcox , linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev
Cc: John Groves , john@jagalactic.com, Dave Chinner , Christoph Hellwig , dave.hansen@linux.intel.com, gregory.price@memverge.com, Randy Dunlap , Jerome Glisse , Aravind Ramesh , Ajay Joshi , Eishan Mirakhur , Ravi Shankar , Srinivasulu Thanneeru , Luis Chamberlain , Amir Goldstein , Chandan Babu R , Bagas Sanjaya , "Darrick J .
Wong" , Kent Overstreet , Steve French , Nathan Lynch , Michael Ellerman , Thomas Zimmermann , Julien Panis , Stanislav Fomichev , Dongsheng Yang , John Groves
Subject: [RFC PATCH v2 12/12] famfs: famfs_ioctl and core file-to-memory mapping logic & iomap_ops
Date: Mon, 29 Apr 2024 12:04:28 -0500
Message-Id: <5824030d31a853ff591b3e1fb4f206b2fd4d1f9f.1714409084.git.john@groves.net>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

* Add uapi include file famfs_ioctl.h. The famfs user space uses ioctl on
  individual files to pass in mapping information and file size. This
  would be hard to do via sysfs or other means, since it's file-specific.

* Add the per-file ioctl function famfs_file_ioctl() to
  struct file_operations, and introduce the famfs_file_init_dax()
  function (which is called by famfs_file_ioctl()).

* Add the famfs iomap_ops. When either dax_iomap_fault() or
  dax_iomap_rw() is called, we get a callback via our iomap_begin()
  handler. The question being asked is "please resolve (file, offset) to
  (daxdev, offset)". The function famfs_meta_to_dax_offset() does this.

* Expose the famfs ABI version as
  /sys/module/famfs/parameters/famfs_kabi_version

The current ioctls are:

FAMFSIOC_MAP_CREATE - famfs_file_init_dax() associates a dax extent list
                      with a file, making it into a proper famfs file.
                      Starting with an empty file (which is not useful),
                      this turns the file into a DAX file backed by the
                      specified extent list from devdax memory.
FAMFSIOC_NOP        - A convenient way for user space to verify it's a
                      famfs file
FAMFSIOC_MAP_GET    - Get the header of the metadata for a file
FAMFSIOC_MAP_GETEXT - Get the extents for a file

The last two, together, are comparable to xfs_bmap. Our user space tools
use them primarily in testing.
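
As an illustration of what FAMFSIOC_MAP_CREATE expects, here is a minimal
user-space sketch of the extent-list checks that famfs_file_init_dax()
applies in this patch: between 1 and FAMFS_MAX_EXTENTS extents, every
offset 2MiB aligned, every length except the last a 2MiB multiple, no
zero offset outside the superblock file, and file_size no larger than the
extent total. The sketch_* names and the simplified struct are
hypothetical stand-ins, not the uapi definitions, and the 2MiB value
assumes x86-64 PMD_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for FAMFS_MAX_EXTENTS and PMD_SIZE (x86-64). */
#define SKETCH_MAX_EXTENTS 2
#define SKETCH_PMD_SIZE (2ULL * 1024 * 1024)

/* Simplified mirror of struct famfs_extent (assumption: same two fields). */
struct sketch_extent {
	uint64_t offset;
	uint64_t len;
};

/* Simplified mirror of struct famfs_ioc_map, without the type enums. */
struct sketch_ioc_map {
	uint64_t file_size;
	uint64_t ext_list_count;
	struct sketch_extent ext_list[SKETCH_MAX_EXTENTS];
};

/*
 * Return true if the map satisfies the constraints listed above.
 * The kernel reports distinct errnos for each failure; this sketch
 * only answers "would it be accepted?".
 */
static bool sketch_map_is_valid(const struct sketch_ioc_map *m,
				bool is_superblock)
{
	uint64_t total = 0;
	uint64_t i;

	if (m->ext_list_count < 1 || m->ext_list_count > SKETCH_MAX_EXTENTS)
		return false;

	for (i = 0; i < m->ext_list_count; i++) {
		const struct sketch_extent *e = &m->ext_list[i];

		/* Offset 0 is reserved for the superblock file */
		if (e->offset == 0 && !is_superblock)
			return false;
		/* Every extent must start on a 2MiB boundary */
		if (e->offset % SKETCH_PMD_SIZE)
			return false;
		/* All but the last length must be a 2MiB multiple */
		if (i < m->ext_list_count - 1 && e->len % SKETCH_PMD_SIZE)
			return false;
		total += e->len;
	}

	/* The file may be smaller than its extents, never larger */
	return m->file_size <= total;
}
```

A tool could run a check like this before filling in the real
struct famfs_ioc_map and calling ioctl(fd, FAMFSIOC_MAP_CREATE, &map),
so that malformed extent lists are caught before they reach the kernel.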
Signed-off-by: John Groves --- MAINTAINERS | 1 + fs/famfs/famfs_file.c | 391 ++++++++++++++++++++++++++++++- fs/famfs/famfs_internal.h | 14 ++ include/uapi/linux/famfs_ioctl.h | 61 +++++ 4 files changed, 461 insertions(+), 6 deletions(-) create mode 100644 include/uapi/linux/famfs_ioctl.h diff --git a/MAINTAINERS b/MAINTAINERS index 365d678e2f40..29d81be488bc 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8189,6 +8189,7 @@ L: linux-fsdevel@vger.kernel.org S: Supported F: Documentation/filesystems/famfs.rst F: fs/famfs +F: include/uapi/linux/famfs_ioctl.h FANOTIFY M: Jan Kara diff --git a/fs/famfs/famfs_file.c b/fs/famfs/famfs_file.c index 585b776dd73c..ac34e606ca1b 100644 --- a/fs/famfs/famfs_file.c +++ b/fs/famfs/famfs_file.c @@ -14,8 +14,371 @@ #include #include +#include #include "famfs_internal.h" +/* Expose famfs kernel abi version as a read-only module parameter */ +static int famfs_kabi_version = FAMFS_KABI_VERSION; +module_param(famfs_kabi_version, int, 0444); +MODULE_PARM_DESC(famfs_kabi_version, "famfs kernel abi version"); + +/** + * famfs_meta_alloc() - Allocate famfs file metadata + * @metap: Pointer to an mcache_map_meta pointer + * @ext_count: The number of extents needed + */ +static int +famfs_meta_alloc(struct famfs_file_meta **metap, size_t ext_count) +{ + struct famfs_file_meta *meta; + + meta = kzalloc(struct_size(meta, tfs_extents, ext_count), GFP_KERNEL); + if (!meta) + return -ENOMEM; + + meta->tfs_extent_ct = ext_count; + meta->error = false; + *metap = meta; + + return 0; +} + +static void +famfs_meta_free(struct famfs_file_meta *map) +{ + kfree(map); +} + +/** + * famfs_file_init_dax() - FAMFSIOC_MAP_CREATE ioctl handler + * @file: the un-initialized file + * @arg: ptr to struct mcioc_map in user space + * + * Setup the dax mapping for a file. Files are created empty, and then function + * is called by famfs_file_ioctl() to setup the mapping and set the file size. 
+ */ +static int +famfs_file_init_dax(struct file *file, void __user *arg) +{ + struct famfs_file_meta *meta = NULL; + struct famfs_ioc_map imap; + struct famfs_fs_info *fsi; + size_t extent_total = 0; + int alignment_errs = 0; + struct super_block *sb; + struct inode *inode; + size_t ext_count; + int rc; + int i; + + inode = file_inode(file); + if (!inode) { + rc = -EBADF; + goto errout; + } + + sb = inode->i_sb; + fsi = sb->s_fs_info; + if (fsi->deverror) + return -ENODEV; + + rc = copy_from_user(&imap, arg, sizeof(imap)); + if (rc) + return -EFAULT; + + ext_count = imap.ext_list_count; + if (ext_count < 1) { + rc = -ENOSPC; + goto errout; + } + + if (ext_count > FAMFS_MAX_EXTENTS) { + rc = -E2BIG; + goto errout; + } + + rc = famfs_meta_alloc(&meta, ext_count); + if (rc) + goto errout; + + meta->file_type = imap.file_type; + meta->file_size = imap.file_size; + + /* Fill in the internal file metadata structure */ + for (i = 0; i < imap.ext_list_count; i++) { + size_t len; + off_t offset; + + offset = imap.ext_list[i].offset; + len = imap.ext_list[i].len; + + extent_total += len; + + if (WARN_ON(offset == 0 && meta->file_type != FAMFS_SUPERBLOCK)) { + rc = -EINVAL; + goto errout; + } + + meta->tfs_extents[i].offset = offset; + meta->tfs_extents[i].len = len; + + /* All extent addresses/offsets must be 2MiB aligned, + * and all but the last length must be a 2MiB multiple. 
+ */ + if (!IS_ALIGNED(offset, PMD_SIZE)) { + pr_err("%s: error ext %d hpa %lx not aligned\n", + __func__, i, offset); + alignment_errs++; + } + if (i < (imap.ext_list_count - 1) && !IS_ALIGNED(len, PMD_SIZE)) { + pr_err("%s: error ext %d length %ld not aligned\n", + __func__, i, len); + alignment_errs++; + } + } + + /* + * File size can be <= ext list size, since extent sizes are constrained + * to PMD multiples + */ + if (imap.file_size > extent_total) { + pr_err("%s: file size %lld larger than ext list size %lld\n", + __func__, (u64)imap.file_size, (u64)extent_total); + rc = -EINVAL; + goto errout; + } + + if (alignment_errs > 0) { + pr_err("%s: there were %d alignment errors in the extent list\n", + __func__, alignment_errs); + rc = -EINVAL; + goto errout; + } + + /* Publish the famfs metadata on inode->i_private */ + inode_lock(inode); + if (inode->i_private) { + rc = -EEXIST; /* file already has famfs metadata */ + } else { + inode->i_private = meta; + i_size_write(inode, imap.file_size); + inode->i_flags |= S_DAX; + } + inode_unlock(inode); + + errout: + if (rc) + famfs_meta_free(meta); + + return rc; +} + +/** + * famfs_file_ioctl() - Top-level famfs file ioctl handler + * @file: the file + * @cmd: ioctl opcode + * @arg: ioctl opcode argument (if any) + */ +static long +famfs_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) +{ + struct inode *inode = file_inode(file); + struct famfs_fs_info *fsi = inode->i_sb->s_fs_info; + long rc; + + if (fsi->deverror && (cmd != FAMFSIOC_NOP)) + return -ENODEV; + + switch (cmd) { + case FAMFSIOC_NOP: + rc = 0; + break; + + case FAMFSIOC_MAP_CREATE: + rc = famfs_file_init_dax(file, (void *)arg); + break; + + case FAMFSIOC_MAP_GET: { + struct inode *inode = file_inode(file); + struct famfs_file_meta *meta = inode->i_private; + struct famfs_ioc_map umeta; + + memset(&umeta, 0, sizeof(umeta)); + + if (meta) { + /* TODO: do more to harmonize these structures */ + umeta.extent_type = meta->tfs_extent_type; + 
umeta.file_size = i_size_read(inode);
+			umeta.ext_list_count = meta->tfs_extent_ct;
+
+			/* copy_to_user() returns bytes not copied, not an errno */
+			rc = copy_to_user((void __user *)arg, &umeta,
+					  sizeof(umeta)) ? -EFAULT : 0;
+			if (rc)
+				pr_err("%s: copy_to_user failed\n", __func__);
+
+		} else {
+			rc = -EINVAL;
+		}
+		break;
+	}
+	case FAMFSIOC_MAP_GETEXT: {
+		struct inode *inode = file_inode(file);
+		struct famfs_file_meta *meta = inode->i_private;
+
+		if (!meta)
+			rc = -EINVAL;
+		else
+			rc = copy_to_user((void __user *)arg, meta->tfs_extents,
+				meta->tfs_extent_ct * sizeof(struct famfs_extent)) ? -EFAULT : 0;
+		break;
+	}
+	default:
+		rc = -ENOTTY;
+		break;
+	}
+
+	return rc;
+}
+
+/*********************************************************************
+ * iomap_operations
+ *
+ * This stuff uses the iomap (dax-related) helpers to resolve file offsets to
+ * offsets within a dax device.
+ */
+
+static ssize_t famfs_file_invalid(struct inode *inode);
+
+/**
+ * famfs_meta_to_dax_offset() - Resolve (file, offset, len) to (daxdev, offset, len)
+ *
+ * This function is called by famfs_iomap_begin() to resolve an offset in a
+ * file to an offset in a dax device. This is upcalled from dax from calls to
+ * both dax_iomap_fault() and dax_iomap_rw(). Dax finishes the job resolving
+ * a fault to a specific physical page (the fault case) or doing a memcpy
+ * variant (the rw case)
+ *
+ * Pages can be PTE (4k), PMD (2MiB) or (theoretically) PuD (1GiB)
+ * (these sizes are for X86; may vary on other cpu architectures)
+ *
+ * @inode: The file where the fault occurred
+ * @iomap: To be filled in to indicate where to find the right memory,
+ * relative to a dax device.
+ * @file_offset: Within the file where the fault occurred (will be page boundary)
+ * @len: The length of the faulted mapping (will be a page multiple)
+ * (will be trimmed in *iomap if it's disjoint in the extent list)
+ * @flags: iomap flags from the caller, copied through to @iomap->flags
+ *
+ * Return values: 0.
(info is returned in a modified @iomap struct) + */ +static int +famfs_meta_to_dax_offset(struct inode *inode, struct iomap *iomap, + loff_t file_offset, off_t len, unsigned int flags) +{ + struct famfs_file_meta *meta = inode->i_private; + int i; + loff_t local_offset = file_offset; + struct famfs_fs_info *fsi = inode->i_sb->s_fs_info; + + if (fsi->deverror || famfs_file_invalid(inode)) + goto err_out; + + iomap->offset = file_offset; + + for (i = 0; i < meta->tfs_extent_ct; i++) { + loff_t dax_ext_offset = meta->tfs_extents[i].offset; + loff_t dax_ext_len = meta->tfs_extents[i].len; + + if ((dax_ext_offset == 0) && + (meta->file_type != FAMFS_SUPERBLOCK)) + pr_warn("%s: zero offset on non-superblock file!!\n", + __func__); + + /* local_offset is the offset minus the size of extents skipped + * so far; If local_offset < dax_ext_len, the data of interest + * starts in this extent + */ + if (local_offset < dax_ext_len) { + loff_t ext_len_remainder = dax_ext_len - local_offset; + + /* + * OK, we found the file metadata extent where this + * data begins + * @local_offset - The offset within the current + * extent + * @ext_len_remainder - Remaining length of ext after + * skipping local_offset + * Outputs: + * iomap->addr: the offset within the dax device where + * the data starts + * iomap->offset: the file offset + * iomap->length: the valid length resolved here + */ + iomap->addr = dax_ext_offset + local_offset; + iomap->offset = file_offset; + iomap->length = min_t(loff_t, len, ext_len_remainder); + iomap->dax_dev = fsi->dax_devp; + iomap->type = IOMAP_MAPPED; + iomap->flags = flags; + + return 0; + } + local_offset -= dax_ext_len; /* Get ready for the next extent */ + } + + err_out: + /* We fell out the end of the extent list. 
+ * Set iomap to zero length in this case, and return 0 + * This just means that the r/w is past EOF + */ + iomap->addr = 0; /* there is no valid dax device offset */ + iomap->offset = file_offset; /* file offset */ + iomap->length = 0; /* this had better result in no access to dax mem */ + iomap->dax_dev = fsi->dax_devp; + iomap->type = IOMAP_MAPPED; + iomap->flags = flags; + + return 0; +} + +/** + * famfs_iomap_begin() - Handler for iomap_begin upcall from dax + * + * This function is pretty simple because files are + * * never partially allocated + * * never have holes (never sparse) + * * never "allocate on write" + * + * @inode: inode for the file being accessed + * @offset: offset within the file + * @length: Length being accessed at offset + * @flags: + * @iomap: iomap struct to be filled in, resolving (offset, length) to + * (daxdev, offset, len) + * @srcmap: + */ +static int +famfs_iomap_begin(struct inode *inode, loff_t offset, loff_t length, + unsigned int flags, struct iomap *iomap, struct iomap *srcmap) +{ + struct famfs_file_meta *meta = inode->i_private; + size_t size; + + size = i_size_read(inode); + + WARN_ON(size != meta->file_size); + + return famfs_meta_to_dax_offset(inode, iomap, offset, length, flags); +} + +/* Note: We never need a special set of write_iomap_ops because famfs never + * performs allocation on write. 
+ */ +const struct iomap_ops famfs_iomap_ops = { + .iomap_begin = famfs_iomap_begin, +}; + /********************************************************************* * vm_operations */ @@ -42,7 +405,7 @@ __famfs_filemap_fault(struct vm_fault *vmf, unsigned int pe_size, file_update_time(vmf->vma->vm_file); } - ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL, NULL /*&famfs_iomap_ops */); + ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL, &famfs_iomap_ops); if (ret & VM_FAULT_NEEDDSYNC) ret = dax_finish_sync_fault(vmf, pe_size, pfn); @@ -106,9 +469,25 @@ const struct vm_operations_struct famfs_file_vm_ops = { static ssize_t famfs_file_invalid(struct inode *inode) { + struct famfs_file_meta *meta = inode->i_private; + size_t i_size = i_size_read(inode); + + if (!meta) { + pr_debug("%s: un-initialized famfs file\n", __func__); + return -EIO; + } + if (meta->error) { + pr_debug("%s: previously detected metadata errors\n", __func__); + return -EIO; + } + if (i_size != meta->file_size) { + pr_warn("%s: i_size overwritten from %ld to %ld\n", + __func__, meta->file_size, i_size); + meta->error = true; + return -ENXIO; + } if (!IS_DAX(inode)) { - pr_debug("%s: inode %llx IS_DAX is false\n", - __func__, (u64)inode); + pr_debug("%s: inode %llx IS_DAX is false\n", __func__, (u64)inode); return -ENXIO; } return 0; @@ -155,7 +534,7 @@ famfs_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) if (!iov_iter_count(to)) return 0; - rc = dax_iomap_rw(iocb, to, NULL /*&famfs_iomap_ops */); + rc = dax_iomap_rw(iocb, to, &famfs_iomap_ops); file_accessed(iocb->ki_filp); return rc; @@ -181,7 +560,7 @@ famfs_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) if (!iov_iter_count(from)) return 0; - return dax_iomap_rw(iocb, from, NULL /*&famfs_iomap_ops*/); + return dax_iomap_rw(iocb, from, &famfs_iomap_ops); } static int @@ -211,7 +590,7 @@ const struct file_operations famfs_file_operations = { /* Custom famfs operations */ .write_iter = famfs_dax_write_iter, .read_iter = 
famfs_dax_read_iter, - .unlocked_ioctl = NULL /*famfs_file_ioctl*/, + .unlocked_ioctl = famfs_file_ioctl, .mmap = famfs_file_mmap, /* Force PMD alignment for mmap */ diff --git a/fs/famfs/famfs_internal.h b/fs/famfs/famfs_internal.h index 36efaef425e7..a45757d4cdea 100644 --- a/fs/famfs/famfs_internal.h +++ b/fs/famfs/famfs_internal.h @@ -11,8 +11,22 @@ #ifndef FAMFS_INTERNAL_H #define FAMFS_INTERNAL_H +#include + extern const struct file_operations famfs_file_operations; +/* + * Each famfs dax file has this hanging from its inode->i_private. + */ +struct famfs_file_meta { + bool error; + enum famfs_file_type file_type; + size_t file_size; + enum famfs_extent_type tfs_extent_type; + size_t tfs_extent_ct; + struct famfs_extent tfs_extents[]; +}; + struct famfs_mount_opts { umode_t mode; }; diff --git a/include/uapi/linux/famfs_ioctl.h b/include/uapi/linux/famfs_ioctl.h new file mode 100644 index 000000000000..97ff5a2a8d13 --- /dev/null +++ b/include/uapi/linux/famfs_ioctl.h @@ -0,0 +1,61 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2024 Micron Technology, Inc. + * + * This file system, originally based on ramfs the dax support from xfs, + * is intended to allow multiple host systems to mount a common file system + * view of dax files that map to shared memory. 
+ */
+#ifndef FAMFS_IOCTL_H
+#define FAMFS_IOCTL_H
+
+#include
+#include
+
+#define FAMFS_KABI_VERSION 42
+#define FAMFS_MAX_EXTENTS 2
+
+/* We anticipate the possibility of supporting additional types of extents */
+enum famfs_extent_type {
+	SIMPLE_DAX_EXTENT,
+	INVALID_EXTENT_TYPE,
+};
+
+struct famfs_extent {
+	__u64 offset;
+	__u64 len;
+};
+
+enum famfs_file_type {
+	FAMFS_REG,
+	FAMFS_SUPERBLOCK,
+	FAMFS_LOG,
+};
+
+/**
+ * struct famfs_ioc_map - the famfs per-file metadata structure
+ * @extent_type: what type of extents are in this ext_list
+ * @file_type: Mark the superblock and log as special files. Maybe more later.
+ * @file_size: Size of the file, which is <= the size of the ext_list
+ * @ext_list_count: Number of extents
+ * @ext_list: 1 or more extents
+ */
+struct famfs_ioc_map {
+	enum famfs_extent_type extent_type;
+	enum famfs_file_type file_type;
+	__u64 file_size;
+	__u64 ext_list_count;
+	struct famfs_extent ext_list[FAMFS_MAX_EXTENTS];
+};
+
+#define FAMFSIOC_MAGIC 'u'
+
+/* famfs file ioctl opcodes */
+#define FAMFSIOC_MAP_CREATE _IOW(FAMFSIOC_MAGIC, 0x50, struct famfs_ioc_map)
+#define FAMFSIOC_MAP_GET _IOR(FAMFSIOC_MAGIC, 0x51, struct famfs_ioc_map)
+#define FAMFSIOC_MAP_GETEXT _IOR(FAMFSIOC_MAGIC, 0x52, struct famfs_extent)
+#define FAMFSIOC_NOP _IO(FAMFSIOC_MAGIC, 0x53)
+
+#endif /* FAMFS_IOCTL_H */
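
For readers following the (file, offset) to (daxdev, offset) resolution
done by famfs_meta_to_dax_offset() in this patch, the extent walk can be
modeled in plain user-space C roughly as follows. This is a sketch under
stated assumptions, not the kernel code: sketch_ext and sketch_resolve()
are hypothetical names, and the iomap bookkeeping is reduced to a
returned length plus an output offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct famfs_extent. */
struct sketch_ext {
	uint64_t offset;	/* offset of the extent within the dax device */
	uint64_t len;		/* length of the extent in bytes */
};

/*
 * Walk the extent list, subtracting the length of each skipped extent
 * from the file offset until the offset falls inside an extent.
 *
 * On success, *dax_offset receives the device offset and the return
 * value is the contiguous length available there, capped at len.
 * A return value of 0 means the offset is past the end of the extent
 * list, which mirrors the zero-length iomap in the kernel code.
 */
static uint64_t sketch_resolve(const struct sketch_ext *exts, size_t n,
			       uint64_t file_offset, uint64_t len,
			       uint64_t *dax_offset)
{
	uint64_t local_offset = file_offset;
	size_t i;

	for (i = 0; i < n; i++) {
		if (local_offset < exts[i].len) {
			uint64_t remainder = exts[i].len - local_offset;

			*dax_offset = exts[i].offset + local_offset;
			return len < remainder ? len : remainder;
		}
		/* Not in this extent; get ready for the next one */
		local_offset -= exts[i].len;
	}

	*dax_offset = 0;	/* there is no valid dax device offset */
	return 0;
}
```

Because a request that straddles two extents is trimmed to the remainder
of the first, the caller is expected to come back for the rest, which is
how the iomap iteration drives iomap_begin() repeatedly.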