From patchwork Thu Jun 27 00:54:15 2024
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 13713646
From: Alistair Popple
To: dan.j.williams@intel.com, vishal.l.verma@intel.com,
    dave.jiang@intel.com, logang@deltatee.com, bhelgaas@google.com,
    jack@suse.cz, jgg@ziepe.ca
Cc: catalin.marinas@arm.com, will@kernel.org, mpe@ellerman.id.au,
    npiggin@gmail.com, dave.hansen@linux.intel.com, ira.weiny@intel.com,
    willy@infradead.org, djwong@kernel.org, tytso@mit.edu,
    linmiaohe@huawei.com, david@redhat.com, peterx@redhat.com,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
    jhubbard@nvidia.com, hch@lst.de, david@fromorbit.com,
    Alistair Popple
Subject: [PATCH 00/13] fs/dax: Fix FS DAX page reference counts
Date: Thu, 27 Jun 2024 10:54:15 +1000

FS DAX pages have always maintained their own page reference counts
without following the normal rules for page reference counting. In
particular, pages are considered free when the refcount hits one rather
than zero, and refcounts are not added when mapping the page.

Tracking this requires special PTE bits (PTE_DEVMAP) and a secondary
mechanism for allowing GUP to hold references on the page (see
get_dev_pagemap).

However, there doesn't seem to be any reason why FS DAX pages need their
own reference counting scheme. By treating the refcounts on these pages
the same way as normal pages, we can remove a lot of special checks. In
particular, pXd_trans_huge() becomes the same as pXd_leaf(), although I
haven't made that change here. It also frees up a valuable SW-defined
PTE bit on architectures that have devmap PTE bits defined.

It also almost certainly allows further clean-up of the devmap managed
functions, but I have left that as a future improvement.

This is an update to the original RFC, rebased onto v6.10-rc5. Unlike
the original RFC, it passes the same number of ndctl test suite
(https://github.com/pmem/ndctl) tests as my current development
environment does without these patches.

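As a rough illustration of the two conventions described above, here is
a userspace toy model. It is only a sketch: struct toy_page and every
helper name below are hypothetical and are not kernel APIs or code from
this series. It assumes the legacy scheme rests an idle page at a
refcount of one and takes no reference on mapping, while the normal
scheme rests at zero and counts each mapping.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel definition. */
struct toy_page {
	int refcount;
};

/* References taken and dropped by a GUP-like user in either scheme. */
static inline void toy_get(struct toy_page *p)
{
	p->refcount++;
}

static inline void toy_put(struct toy_page *p)
{
	assert(p->refcount > 0);	/* catch underflow in the model */
	p->refcount--;
}

/* Legacy FS DAX convention: idle baseline is 1, mapping adds nothing. */
static inline void legacy_init(struct toy_page *p)
{
	p->refcount = 1;
}

static inline void legacy_map(struct toy_page *p)
{
	(void)p;			/* no reference taken on map */
}

static inline bool legacy_is_free(const struct toy_page *p)
{
	return p->refcount == 1;
}

/* Normal convention: idle baseline is 0, each mapping holds a reference. */
static inline void normal_init(struct toy_page *p)
{
	p->refcount = 0;
}

static inline void normal_map(struct toy_page *p)
{
	toy_get(p);
}

static inline void normal_unmap(struct toy_page *p)
{
	toy_put(p);
}

static inline bool normal_is_free(const struct toy_page *p)
{
	return p->refcount == 0;
}
```

In this model a mapped legacy page still looks free from its refcount
alone, so something else (PTE_DEVMAP in the real code) has to track it,
whereas a mapped page under the normal rules is visibly busy until it is
unmapped.
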
I am not intimately familiar with the FS DAX code, so I would appreciate
some careful review there. In particular, I have not given any thought
at all to CONFIG_FS_DAX_LIMITED.

Signed-off-by: Alistair Popple

Alistair Popple (13):
  mm/gup.c: Remove redundant check for PCI P2PDMA page
  pci/p2pdma: Don't initialise page refcount to one
  fs/dax: Refactor wait for dax idle page
  fs/dax: Add dax_page_free callback
  mm: Allow compound zone device pages
  mm/memory: Add dax_insert_pfn
  huge_memory: Allow mappings of PUD sized pages
  huge_memory: Allow mappings of PMD sized pages
  gup: Don't allow FOLL_LONGTERM pinning of FS DAX pages
  fs/dax: Properly refcount fs dax pages
  huge_memory: Remove dead vmf_insert_pXd code
  mm: Remove pXX_devmap callers
  mm: Remove devmap related functions and page table bits

 Documentation/mm/arch_pgtable_helpers.rst     |   6 +-
 arch/arm64/Kconfig                            |   1 +-
 arch/arm64/include/asm/pgtable-prot.h         |   1 +-
 arch/arm64/include/asm/pgtable.h              |  24 +--
 arch/powerpc/Kconfig                          |   1 +-
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   6 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   7 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  52 +----
 arch/powerpc/include/asm/book3s/64/radix.h    |  14 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c       |   3 +-
 arch/powerpc/mm/book3s64/pgtable.c            |   8 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c      |   5 +-
 arch/powerpc/mm/pgtable.c                     |   2 +-
 arch/x86/Kconfig                              |   1 +-
 arch/x86/include/asm/pgtable.h                |  50 +----
 arch/x86/include/asm/pgtable_types.h          |   5 +-
 drivers/dax/device.c                          |  12 +-
 drivers/dax/super.c                           |   2 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c        |   2 +-
 drivers/nvdimm/pmem.c                         |   9 +-
 drivers/pci/p2pdma.c                          |   4 +-
 fs/dax.c                                      | 204 +++++++---------
 fs/ext4/inode.c                               |   5 +-
 fs/fuse/dax.c                                 |   4 +-
 fs/fuse/virtio_fs.c                           |   8 +-
 fs/userfaultfd.c                              |   2 +-
 fs/xfs/xfs_inode.c                            |   4 +-
 include/linux/dax.h                           |  11 +-
 include/linux/huge_mm.h                       |  17 +-
 include/linux/memremap.h                      |  23 +-
 include/linux/migrate.h                       |   2 +-
 include/linux/mm.h                            |  40 +---
 include/linux/page-flags.h                    |   6 +-
 include/linux/pfn_t.h                         |  20 +--
 include/linux/pgtable.h                       |  21 +--
 include/linux/rmap.h                          |  14 +-
 lib/test_hmm.c                                |   2 +-
 mm/Kconfig                                    |   4 +-
 mm/debug_vm_pgtable.c                         |  59 +-----
 mm/gup.c                                      | 178 +--------------
 mm/hmm.c                                      |  12 +-
 mm/huge_memory.c                              | 248 +++++++-------------
 mm/internal.h                                 |   2 +-
 mm/khugepaged.c                               |   2 +-
 mm/mapping_dirty_helpers.c                    |   4 +-
 mm/memory-failure.c                           |   6 +-
 mm/memory.c                                   | 114 ++++++---
 mm/memremap.c                                 |  38 +---
 mm/migrate_device.c                           |   6 +-
 mm/mlock.c                                    |   2 +-
 mm/mm_init.c                                  |   5 +-
 mm/mprotect.c                                 |   2 +-
 mm/mremap.c                                   |   5 +-
 mm/page_vma_mapped.c                          |   5 +-
 mm/pgtable-generic.c                          |   7 +-
 mm/rmap.c                                     |  48 ++++-
 mm/swap.c                                     |   2 +-
 mm/userfaultfd.c                              |   2 +-
 mm/vmscan.c                                   |   5 +-
 59 files changed, 485 insertions(+), 869 deletions(-)

base-commit: f2661062f16b2de5d7b6a5c42a9a5c96326b8454