From patchwork Tue Sep 24 22:59:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 13811346
Subject: [PATCH] dcssblk: Mark DAX broken
From: Dan Williams
To: hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com
Cc: Gerald Schaefer, Christian Borntraeger, Sven Schnelle, Jan Kara,
 Matthew Wilcox, Christoph Hellwig, Alistair Popple,
 linux-s390@vger.kernel.org, linux-mm@kvack.org
Date: Tue, 24 Sep 2024 15:59:08 -0700
Message-ID: <172721874675.497781.3277495908107141898.stgit@dwillia2-xfh.jf.intel.com>
User-Agent: StGit/0.18-3-g996c

The dcssblk driver has long needed special-case support to enable
limited dax operation, so-called CONFIG_FS_DAX_LIMITED. This mode
works around the incomplete support for ZONE_DEVICE on s390 by forgoing
the ability of dax-mapped pages to support GUP.

Now, pending cleanups to fsdax that fix its reference counting [1] depend
on the ability of all dax drivers to supply ZONE_DEVICE pages.

To allow that work to move forward, dax support needs to be paused for
dcssblk until ZONE_DEVICE support arrives. That work has been known for
a few years [2], and the removal of "pte_devmap" requirements [3] makes the
conversion easier.

For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
(dcssblk was the only user).

Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
Cc: Gerald Schaefer
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Alexander Gordeev
Cc: Christian Borntraeger
Cc: Sven Schnelle
Cc: Jan Kara
Cc: Matthew Wilcox
Cc: Christoph Hellwig
Cc: Alistair Popple
Signed-off-by: Dan Williams
Tested-by: Alexander Gordeev
Acked-by: David Hildenbrand
Reviewed-by: Gerald Schaefer
---
 drivers/s390/block/Kconfig   |   12 ++++++++++--
 drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
 fs/Kconfig                   |    9 +--------
 fs/dax.c                     |   12 ------------
 include/linux/pfn_t.h        |   15 ---------------
 mm/memory.c                  |    2 --
 mm/memremap.c                |    4 ----
 7 files changed, 28 insertions(+), 52 deletions(-)

diff --git a/drivers/s390/block/Kconfig b/drivers/s390/block/Kconfig
index e3710a762aba..4bfe469c04aa 100644
--- a/drivers/s390/block/Kconfig
+++ b/drivers/s390/block/Kconfig
@@ -4,13 +4,21 @@ comment "S/390 block device drivers"
 
 config DCSSBLK
 	def_tristate m
-	select FS_DAX_LIMITED
-	select DAX
 	prompt "DCSSBLK support"
 	depends on S390 && BLOCK
 	help
 	  Support for dcss block device
 
+config DCSSBLK_DAX
+	def_bool y
+	depends on DCSSBLK
+	# requires S390 ZONE_DEVICE support
+	depends on BROKEN
+	select DAX
+	prompt "DCSSBLK DAX support"
+	help
+	  Enable DAX operation for the dcss block device
+
 config DASD
 	def_tristate y
 	prompt "Support for DASD devices"
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 02a4a51da1b7..d1bc79cf56bd 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -540,6 +540,21 @@ static const struct attribute_group *dcssblk_dev_attr_groups[] = {
 	NULL,
 };
 
+static int dcssblk_setup_dax(struct dcssblk_dev_info *dev_info)
+{
+	struct dax_device *dax_dev;
+
+	if (!IS_ENABLED(CONFIG_DCSSBLK_DAX))
+		return 0;
+
+	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
+	if (IS_ERR(dax_dev))
+		return PTR_ERR(dax_dev);
+	set_dax_synchronous(dax_dev);
+	dev_info->dax_dev = dax_dev;
+	return dax_add_host(dev_info->dax_dev, dev_info->gd);
+}
+
 /*
  * device attribute for adding devices
  */
@@ -680,14 +695,7 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
 	if (rc)
 		goto put_dev;
 
-	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
-	if (IS_ERR(dax_dev)) {
-		rc = PTR_ERR(dax_dev);
-		goto put_dev;
-	}
-	set_dax_synchronous(dax_dev);
-	dev_info->dax_dev = dax_dev;
-	rc = dax_add_host(dev_info->dax_dev, dev_info->gd);
+	rc = dcssblk_setup_dax(dev_info);
 	if (rc)
 		goto out_dax;
 
@@ -923,7 +931,7 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 		*kaddr = __va(dev_info->start + offset);
 	if (pfn)
 		*pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
-				PFN_DEV|PFN_SPECIAL);
+				PFN_DEV);
 
 	return (dev_sz - offset) / PAGE_SIZE;
 }
diff --git a/fs/Kconfig b/fs/Kconfig
index 0e4efec1d92e..a6f4f28fa09e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -60,7 +60,7 @@ endif # BLOCK
 config FS_DAX
 	bool "File system based Direct Access (DAX) support"
 	depends on MMU
-	depends on ZONE_DEVICE || FS_DAX_LIMITED
+	depends on ZONE_DEVICE
 	select FS_IOMAP
 	select DAX
 	help
@@ -96,13 +96,6 @@ config FS_DAX_PMD
 	depends on ZONE_DEVICE
 	depends on TRANSPARENT_HUGEPAGE
 
-# Selected by DAX drivers that do not expect filesystem DAX to support
-# get_user_pages() of DAX mappings. I.e. "limited" indicates no support
-# for fork() of processes with MAP_SHARED mappings or support for
-# direct-I/O to a DAX mapping.
-config FS_DAX_LIMITED
-	bool
-
 # Posix ACL utility routines
 #
 # Note: Posix ACLs can be implemented without these helpers. Never use
diff --git a/fs/dax.c b/fs/dax.c
index becb4a6920c6..6257d3fdf8f8 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -359,9 +359,6 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
 	unsigned long size = dax_entry_size(entry), pfn, index;
 	int i = 0;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	index = linear_page_index(vma, address & ~(size - 1));
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
@@ -381,9 +378,6 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping,
 {
 	unsigned long pfn;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
 
@@ -684,12 +678,6 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping,
 	pgoff_t end_idx;
 	XA_STATE(xas, &mapping->i_pages, start_idx);
 
-	/*
-	 * In the 'limited' case get_user_pages() for dax is disabled.
-	 */
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return NULL;
-
 	if (!dax_mapping(mapping) || !mapping_mapped(mapping))
 		return NULL;
 
diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
index 2d9148221e9a..eb8da94d1d19 100644
--- a/include/linux/pfn_t.h
+++ b/include/linux/pfn_t.h
@@ -9,18 +9,14 @@
  * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
  * PFN_DEV - pfn is not covered by system memmap by default
  * PFN_MAP - pfn has a dynamic page mapping established by a device driver
- * PFN_SPECIAL - for CONFIG_FS_DAX_LIMITED builds to allow XIP, but not
- * get_user_pages
  */
 #define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
 #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
 #define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
 #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
 #define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
-#define PFN_SPECIAL (1ULL << (BITS_PER_LONG_LONG - 5))
 
 #define PFN_FLAGS_TRACE \
-	{ PFN_SPECIAL, "SPECIAL" }, \
 	{ PFN_SG_CHAIN, "SG_CHAIN" }, \
 	{ PFN_SG_LAST, "SG_LAST" }, \
 	{ PFN_DEV, "DEV" }, \
@@ -117,15 +113,4 @@ pud_t pud_mkdevmap(pud_t pud);
 #endif
 #endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
 
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return (pfn.val & PFN_SPECIAL) == PFN_SPECIAL;
-}
-#else
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return false;
-}
-#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
 #endif /* _LINUX_PFN_T_H_ */
diff --git a/mm/memory.c b/mm/memory.c
index c31ea300cdf6..676f5cda992a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2462,8 +2462,6 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn, bool mkwrite)
 		return true;
 	if (pfn_t_devmap(pfn))
 		return true;
-	if (pfn_t_special(pfn))
-		return true;
 	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
 		return true;
 	return false;
diff --git a/mm/memremap.c b/mm/memremap.c
index 40d4547ce514..a6bbbe180eab 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -332,10 +332,6 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 		}
 		break;
 	case MEMORY_DEVICE_FS_DAX:
-		if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) {
-			WARN(1, "File system DAX not supported\n");
-			return ERR_PTR(-EINVAL);
-		}
 		params.pgprot = pgprot_decrypted(params.pgprot);
 		break;
 	case MEMORY_DEVICE_GENERIC:
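
The new dcssblk_setup_dax() helper guards its body with IS_ENABLED() rather
than #ifdef, so the DAX setup code is still compiled and type-checked but
becomes a no-op returning 0 when CONFIG_DCSSBLK_DAX is off. The userspace
sketch below illustrates only that pattern. SKETCH_IS_ENABLED(), the
CONFIG_SKETCH_* options and all sketch_* names are hypothetical stand-ins
invented for illustration; they are not part of this patch or of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's IS_ENABLED(): evaluates
 * to 1 when the option macro is defined as 1, and to 0 when it is undefined.
 */
#define SKETCH_PLACEHOLDER_1 0,
#define SKETCH_SECOND_ARG(ignored, val, ...) val
#define SKETCH_DEFINED_2(arg1_or_junk) SKETCH_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define SKETCH_DEFINED_1(val) SKETCH_DEFINED_2(SKETCH_PLACEHOLDER_##val)
#define SKETCH_IS_ENABLED(option) SKETCH_DEFINED_1(option)

/* Assumption: one option that is on, and CONFIG_SKETCH_DAX left undefined. */
#define CONFIG_SKETCH_ON 1

struct sketch_dev {
	void *dax_dev;	/* stays NULL while the optional feature is off */
};

static int sketch_backend;	/* hypothetical stand-in for a dax_device */

/*
 * Same shape as dcssblk_setup_dax(): report success without doing any work
 * when the option is compiled out, otherwise attach the backend.
 */
static int sketch_setup_dax(struct sketch_dev *dev)
{
	if (!SKETCH_IS_ENABLED(CONFIG_SKETCH_DAX))
		return 0;

	dev->dax_dev = &sketch_backend;
	return 0;
}

/* Exercise the helper: with the option undefined, setup must be a no-op. */
void sketch_demo(void)
{
	struct sketch_dev dev = { NULL };

	assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_ON) == 1);
	assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_DAX) == 0);
	assert(sketch_setup_dax(&dev) == 0);
	assert(dev.dax_dev == NULL);
}
```

Because the guard is an ordinary C expression, the caller needs no
conditional compilation of its own: it calls the helper unconditionally
and checks the return code, as dcssblk_add_store() does in the hunk above.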