
dcssblk: Mark DAX broken

Message ID 172721874675.497781.3277495908107141898.stgit@dwillia2-xfh.jf.intel.com (mailing list archive)
State New
Series dcssblk: Mark DAX broken

Commit Message

Dan Williams Sept. 24, 2024, 10:59 p.m. UTC
The dcssblk driver has long needed special case support to enable
limited dax operation, so-called CONFIG_FS_DAX_LIMITED. This mode
works around the incomplete support for ZONE_DEVICE on s390 by forgoing
the ability of dax-mapped pages to support GUP.

Now, pending cleanups to fsdax that fix its reference counting [1] depend on
the ability of all dax drivers to supply ZONE_DEVICE pages.

To allow that work to move forward, dax support needs to be paused for
dcssblk until ZONE_DEVICE support arrives. That work has been known for
a few years [2], and the removal of "pte_devmap" requirements [3] makes the
conversion easier.

For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
(dcssblk was the only user).

Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/s390/block/Kconfig   |   12 ++++++++++--
 drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
 fs/Kconfig                   |    9 +--------
 fs/dax.c                     |   12 ------------
 include/linux/pfn_t.h        |   15 ---------------
 mm/memory.c                  |    2 --
 mm/memremap.c                |    4 ----
 7 files changed, 28 insertions(+), 52 deletions(-)

Comments

Dan Williams Sept. 24, 2024, 11:26 p.m. UTC | #1
Dan Williams wrote:
> The dcssblk driver has long needed special case support to enable
> limited dax operation, so-called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
> 
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
> 
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
> 
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
> 
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig   |   12 ++++++++++--
>  drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
>  fs/Kconfig                   |    9 +--------
>  fs/dax.c                     |   12 ------------
>  include/linux/pfn_t.h        |   15 ---------------
>  mm/memory.c                  |    2 --
>  mm/memremap.c                |    4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)

As additional motivation, with this addressed, pfn_t can also be removed
for "moar red-diff!":  44 files changed, 141 insertions(+), 301 deletions(-)

Patch below is on top of Alistair's series. It will need to be rebased
on top of the final version of that, but here it is for demonstration
purposes.
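
For readers who have not worked with pfn_t, here is a minimal userspace
sketch of the conversion pattern the patch applies throughout the tree. It
is not kernel code. The model_* names, the fixed 4 KiB page size, and the
exact flag bit positions are assumptions made for illustration. The only
thing carried over from the kernel is the general idea that pfn_t wraps a
64-bit value whose upper bits hold flags, and that the replacement is a
plain unsigned frame number obtained by shifting the physical address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes PAGE_SHIFT from the arch. */
#define MODEL_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for the removed pfn_t: a frame number with
 * flag bits packed into the top of a 64-bit value.
 */
typedef struct {
	uint64_t val;
} model_pfn_t;

/* Illustrative layout only: reserve the top 12 bits for flags. */
#define MODEL_PFN_FLAGS_MASK (0xfffULL << 52)
#define MODEL_PFN_DEV        (1ULL << 61)

/* "Before": build a flagged pfn from a physical address. */
static inline model_pfn_t model_phys_to_pfn_t(uint64_t phys, uint64_t flags)
{
	model_pfn_t pfn = {
		.val = (phys >> MODEL_PAGE_SHIFT) |
		       (flags & MODEL_PFN_FLAGS_MASK),
	};

	return pfn;
}

/* "Before": strip the flags to recover the raw frame number. */
static inline uint64_t model_pfn_t_to_pfn(model_pfn_t pfn)
{
	return pfn.val & ~MODEL_PFN_FLAGS_MASK;
}

/* "After": the frame number is just the shifted physical address. */
static inline uint64_t model_phys_pfn(uint64_t phys)
{
	return phys >> MODEL_PAGE_SHIFT;
}

/* "After": inverse conversion back to a page-aligned physical address. */
static inline uint64_t model_pfn_phys(uint64_t pfn)
{
	return pfn << MODEL_PAGE_SHIFT;
}
```

In this model, model_pfn_t_to_pfn(model_phys_to_pfn_t(phys, MODEL_PFN_DEV))
and model_phys_pfn(phys) give the same frame number. That is why a call
site which never inspects the flags can switch from the wrapped type to an
unsigned long without changing behavior.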

-- >8 --
Subject: mm: Remove pfn_t

From: Dan Williams <dan.j.williams@intel.com>

The pfn_t type was created to convey mapping constraints from
->direct_access() methods to core mm helpers like vmf_insert_mixed(). Now
that all ->direct_access() helpers return ZONE_DEVICE pages, and
ZONE_DEVICE pages no longer require pte_devmap, there is no longer a
need for pfn_t.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/pat/memtype.c                |    5 +-
 drivers/dax/device.c                     |   19 +++---
 drivers/dax/hmem/hmem.c                  |    1 
 drivers/dax/kmem.c                       |    1 
 drivers/dax/pmem.c                       |    1 
 drivers/dax/pmem/pmem.c                  |    1 
 drivers/dax/super.c                      |    3 -
 drivers/gpu/drm/exynos/exynos_drm_gem.c  |    1 
 drivers/gpu/drm/gma500/fbdev.c           |    3 -
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |    1 
 drivers/gpu/drm/msm/msm_gem.c            |    1 
 drivers/gpu/drm/omapdrm/omap_gem.c       |    7 +-
 drivers/gpu/drm/v3d/v3d_bo.c             |    1 
 drivers/md/dm-linear.c                   |    4 +
 drivers/md/dm-log-writes.c               |    5 +-
 drivers/md/dm-stripe.c                   |    4 +
 drivers/md/dm-target.c                   |    4 +
 drivers/md/dm-writecache.c               |   16 +----
 drivers/md/dm.c                          |    4 +
 drivers/nvdimm/pmem.c                    |   15 ++---
 drivers/nvdimm/pmem.h                    |    6 +-
 drivers/s390/block/dcssblk.c             |   21 +++----
 fs/cramfs/inode.c                        |    4 +
 fs/dax.c                                 |   53 +++++++++--------
 fs/ext4/file.c                           |    2 -
 fs/fuse/dax.c                            |    3 -
 fs/fuse/virtio_fs.c                      |    5 +-
 fs/xfs/xfs_file.c                        |    2 -
 include/linux/dax.h                      |   12 ++--
 include/linux/device-mapper.h            |    7 +-
 include/linux/huge_mm.h                  |    8 +--
 include/linux/mm.h                       |    7 +-
 include/linux/pfn.h                      |   13 ----
 include/linux/pfn_t.h                    |   96 ------------------------------
 include/linux/pgtable.h                  |    4 +
 include/trace/events/fs_dax.h            |   14 ++--
 mm/debug_vm_pgtable.c                    |    1 
 mm/huge_memory.c                         |   27 ++++----
 mm/memory.c                              |   38 +++++-------
 mm/memremap.c                            |    1 
 mm/migrate.c                             |    1 
 tools/testing/nvdimm/pmem-dax.c          |    8 +--
 tools/testing/nvdimm/test/iomap.c        |   11 ---
 tools/testing/nvdimm/test/nfit_test.h    |    1 
 44 files changed, 141 insertions(+), 301 deletions(-)
 delete mode 100644 include/linux/pfn_t.h

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index eb84593cf95c..da57ccb2da34 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -36,7 +36,6 @@
 #include <linux/debugfs.h>
 #include <linux/ioport.h>
 #include <linux/kernel.h>
-#include <linux/pfn_t.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
 #include <linux/highmem.h>
@@ -1074,7 +1073,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 	return 0;
 }
 
-void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
+void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, unsigned long pfn)
 {
 	enum page_cache_mode pcm;
 
@@ -1082,7 +1081,7 @@ void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
 		return;
 
 	/* Set prot based on lookup */
-	pcm = lookup_memtype(pfn_t_to_phys(pfn));
+	pcm = lookup_memtype(PFN_PHYS(pfn));
 	*prot = __pgprot((pgprot_val(*prot) & (~_PAGE_CACHE_MASK)) |
 			 cachemode2protval(pcm));
 }
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 4d3ddd128790..aae90a5bcd30 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -4,7 +4,6 @@
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
-#include <linux/pfn_t.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
 #include <linux/dax.h>
@@ -73,8 +72,8 @@ __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 	return -1;
 }
 
-static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
-			      unsigned long fault_size)
+static void dax_set_mapping(struct vm_fault *vmf, unsigned long pfn,
+			    unsigned long fault_size)
 {
 	unsigned long i, nr_pages = fault_size / PAGE_SIZE;
 	struct file *filp = vmf->vma->vm_file;
@@ -89,7 +88,7 @@ static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
 			ALIGN(vmf->address, fault_size));
 
 	for (i = 0; i < nr_pages; i++) {
-		struct page *page = pfn_to_page(pfn_t_to_pfn(pfn) + i);
+		struct page *page = pfn_to_page(pfn + i);
 
 		page = compound_head(page);
 		if (page->mapping)
@@ -105,7 +104,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
 {
 	struct device *dev = &dev_dax->dev;
 	phys_addr_t phys;
-	pfn_t pfn;
+	unsigned long pfn;
 	unsigned int fault_size = PAGE_SIZE;
 
 	if (check_vma(dev_dax, vmf->vma, __func__))
@@ -126,7 +125,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
 		return VM_FAULT_SIGBUS;
 	}
 
-	pfn = phys_to_pfn_t(phys, 0);
+	pfn = PHYS_PFN(phys);
 
 	dax_set_mapping(vmf, pfn, fault_size);
 
@@ -140,7 +139,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
 	struct device *dev = &dev_dax->dev;
 	phys_addr_t phys;
 	pgoff_t pgoff;
-	pfn_t pfn;
+	unsigned long pfn;
 	unsigned int fault_size = PMD_SIZE;
 
 	if (check_vma(dev_dax, vmf->vma, __func__))
@@ -169,7 +168,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
 		return VM_FAULT_SIGBUS;
 	}
 
-	pfn = phys_to_pfn_t(phys, 0);
+	pfn = PHYS_PFN(phys);
 
 	dax_set_mapping(vmf, pfn, fault_size);
 
@@ -184,7 +183,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
 	struct device *dev = &dev_dax->dev;
 	phys_addr_t phys;
 	pgoff_t pgoff;
-	pfn_t pfn;
+	unsigned long pfn;
 	unsigned int fault_size = PUD_SIZE;
 
 
@@ -214,7 +213,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
 		return VM_FAULT_SIGBUS;
 	}
 
-	pfn = phys_to_pfn_t(phys, 0);
+	pfn = PHYS_PFN(phys);
 
 	dax_set_mapping(vmf, pfn, fault_size);
 
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index 5e7c53f18491..c18451a37e4f 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -2,7 +2,6 @@
 #include <linux/platform_device.h>
 #include <linux/memregion.h>
 #include <linux/module.h>
-#include <linux/pfn_t.h>
 #include <linux/dax.h>
 #include "../bus.h"
 
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index e97d47f42ee2..87b5321675ff 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -5,7 +5,6 @@
 #include <linux/memory.h>
 #include <linux/module.h>
 #include <linux/device.h>
-#include <linux/pfn_t.h>
 #include <linux/slab.h>
 #include <linux/dax.h>
 #include <linux/fs.h>
diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c
index c8ebf4e281f2..bee93066a849 100644
--- a/drivers/dax/pmem.c
+++ b/drivers/dax/pmem.c
@@ -2,7 +2,6 @@
 /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
 #include <linux/memremap.h>
 #include <linux/module.h>
-#include <linux/pfn_t.h>
 #include "../nvdimm/pfn.h"
 #include "../nvdimm/nd.h"
 #include "bus.h"
diff --git a/drivers/dax/pmem/pmem.c b/drivers/dax/pmem/pmem.c
index dfe91a2990fe..ce3394617d15 100644
--- a/drivers/dax/pmem/pmem.c
+++ b/drivers/dax/pmem/pmem.c
@@ -3,7 +3,6 @@
 #include <linux/percpu-refcount.h>
 #include <linux/memremap.h>
 #include <linux/module.h>
-#include <linux/pfn_t.h>
 #include <linux/nd.h>
 #include "../bus.h"
 
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 57a94a6c00e5..3706d803acbf 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -7,7 +7,6 @@
 #include <linux/mount.h>
 #include <linux/pseudo_fs.h>
 #include <linux/magic.h>
-#include <linux/pfn_t.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
 #include <linux/uio.h>
@@ -148,7 +147,7 @@ enum dax_device_flags {
  * pages accessible at the device relative @pgoff.
  */
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		enum dax_access_mode mode, void **kaddr, pfn_t *pfn)
+		enum dax_access_mode mode, void **kaddr, unsigned long *pfn)
 {
 	long avail;
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 638ca96830e9..ab8d6cea09f5 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -7,7 +7,6 @@
 
 
 #include <linux/dma-buf.h>
-#include <linux/pfn_t.h>
 #include <linux/shmem_fs.h>
 #include <linux/module.h>
 
diff --git a/drivers/gpu/drm/gma500/fbdev.c b/drivers/gpu/drm/gma500/fbdev.c
index 98b44974d42d..997c9038db38 100644
--- a/drivers/gpu/drm/gma500/fbdev.c
+++ b/drivers/gpu/drm/gma500/fbdev.c
@@ -6,7 +6,6 @@
  **************************************************************************/
 
 #include <linux/fb.h>
-#include <linux/pfn_t.h>
 
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_drv.h>
@@ -33,7 +32,7 @@ static vm_fault_t psb_fbdev_vm_fault(struct vm_fault *vmf)
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	for (i = 0; i < page_num; ++i) {
-		err = vmf_insert_mixed(vma, address, __pfn_to_pfn_t(pfn, PFN_DEV));
+		err = vmf_insert_mixed(vma, address, pfn);
 		if (unlikely(err & VM_FAULT_ERROR))
 			break;
 		address += PAGE_SIZE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index cac6d4184506..4faab805909d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -6,7 +6,6 @@
 
 #include <linux/anon_inodes.h>
 #include <linux/mman.h>
-#include <linux/pfn_t.h>
 #include <linux/sizes.h>
 
 #include <drm/drm_cache.h>
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index ebc9ba66efb8..1c275008b223 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -9,7 +9,6 @@
 #include <linux/spinlock.h>
 #include <linux/shmem_fs.h>
 #include <linux/dma-buf.h>
-#include <linux/pfn_t.h>
 
 #include <drm/drm_prime.h>
 #include <drm/drm_file.h>
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
index fdae677558f3..5523196f5b28 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -8,7 +8,6 @@
 #include <linux/seq_file.h>
 #include <linux/shmem_fs.h>
 #include <linux/spinlock.h>
-#include <linux/pfn_t.h>
 #include <linux/vmalloc.h>
 
 #include <drm/drm_prime.h>
@@ -371,8 +370,7 @@ static vm_fault_t omap_gem_fault_1d(struct drm_gem_object *obj,
 	VERB("Inserting %p pfn %lx, pa %lx", (void *)vmf->address,
 			pfn, pfn << PAGE_SHIFT);
 
-	return vmf_insert_mixed(vma, vmf->address,
-			__pfn_to_pfn_t(pfn, PFN_DEV));
+	return vmf_insert_mixed(vma, vmf->address, pfn);
 }
 
 /* Special handling for the case of faulting in 2d tiled buffers */
@@ -467,8 +465,7 @@ static vm_fault_t omap_gem_fault_2d(struct drm_gem_object *obj,
 			pfn, pfn << PAGE_SHIFT);
 
 	for (i = n; i > 0; i--) {
-		ret = vmf_insert_mixed(vma,
-			vaddr, __pfn_to_pfn_t(pfn, PFN_DEV));
+		ret = vmf_insert_mixed(vma, vaddr, pfn);
 		if (ret & VM_FAULT_ERROR)
 			break;
 		pfn += priv->usergart[fmt].stride_pfn;
diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c
index a165cbcdd27b..091bc758b23a 100644
--- a/drivers/gpu/drm/v3d/v3d_bo.c
+++ b/drivers/gpu/drm/v3d/v3d_bo.c
@@ -20,7 +20,6 @@
  */
 
 #include <linux/dma-buf.h>
-#include <linux/pfn_t.h>
 #include <linux/vmalloc.h>
 
 #include "v3d_drv.h"
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 49fb0f684193..211528d1eebf 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -167,8 +167,8 @@ static struct dax_device *linear_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+				     long nr_pages, enum dax_access_mode mode,
+				     void **kaddr, unsigned long *pfn)
 {
 	struct dax_device *dax_dev = linear_dax_pgoff(ti, &pgoff);
 
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 8d7df8303d0a..63037f0cd277 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -890,8 +890,9 @@ static struct dax_device *log_writes_dax_pgoff(struct dm_target *ti,
 }
 
 static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+					 long nr_pages,
+					 enum dax_access_mode mode,
+					 void **kaddr, unsigned long *pfn)
 {
 	struct dax_device *dax_dev = log_writes_dax_pgoff(ti, &pgoff);
 
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index 4112071de0be..b13c43d716f1 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -315,8 +315,8 @@ static struct dax_device *stripe_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+				     long nr_pages, enum dax_access_mode mode,
+				     void **kaddr, unsigned long *pfn)
 {
 	struct dax_device *dax_dev = stripe_dax_pgoff(ti, &pgoff);
 
diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 652627aea11b..6dfb6d680f2c 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -254,8 +254,8 @@ static void io_err_io_hints(struct dm_target *ti, struct queue_limits *limits)
 }
 
 static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+				     long nr_pages, enum dax_access_mode mode,
+				     void **kaddr, unsigned long *pfn)
 {
 	return -EIO;
 }
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 7ce8847b3404..2c841e30ae92 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -13,7 +13,6 @@
 #include <linux/dm-io.h>
 #include <linux/dm-kcopyd.h>
 #include <linux/dax.h>
-#include <linux/pfn_t.h>
 #include <linux/libnvdimm.h>
 #include <linux/delay.h>
 #include "dm-io-tracker.h"
@@ -256,7 +255,7 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 	int r;
 	loff_t s;
 	long p, da;
-	pfn_t pfn;
+	unsigned long pfn;
 	int id;
 	struct page **pages;
 	sector_t offset;
@@ -290,11 +289,6 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 		r = da;
 		goto err2;
 	}
-	if (!pfn_t_has_page(pfn)) {
-		wc->memory_map = NULL;
-		r = -EOPNOTSUPP;
-		goto err2;
-	}
 	if (da != p) {
 		long i;
 
@@ -314,13 +308,9 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 				r = daa ? daa : -EINVAL;
 				goto err3;
 			}
-			if (!pfn_t_has_page(pfn)) {
-				r = -EOPNOTSUPP;
-				goto err3;
-			}
 			while (daa-- && i < p) {
-				pages[i++] = pfn_t_to_page(pfn);
-				pfn.val++;
+				pages[i++] = pfn_to_page(pfn);
+				pfn++;
 				if (!(i & 15))
 					cond_resched();
 			}
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 87bb90303435..d24324c49433 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1231,8 +1231,8 @@ static struct dm_target *dm_dax_get_live_target(struct mapped_device *md,
 }
 
 static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+				 long nr_pages, enum dax_access_mode mode,
+				 void **kaddr, unsigned long *pfn)
 {
 	struct mapped_device *md = dax_get_private(dax_dev);
 	sector_t sector = pgoff * PAGE_SECTORS;
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 451cd0fa0c94..d3b3febc8124 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -20,7 +20,6 @@
 #include <linux/kstrtox.h>
 #include <linux/vmalloc.h>
 #include <linux/blk-mq.h>
-#include <linux/pfn_t.h>
 #include <linux/slab.h>
 #include <linux/uio.h>
 #include <linux/dax.h>
@@ -242,7 +241,7 @@ static void pmem_submit_bio(struct bio *bio)
 /* see "strong" declaration in tools/testing/nvdimm/pmem-dax.c */
 __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+		unsigned long *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 	sector_t sector = PFN_PHYS(pgoff) >> SECTOR_SHIFT;
@@ -254,7 +253,7 @@ __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = pmem->virt_addr + offset;
 	if (pfn)
-		*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
+		*pfn = PHYS_PFN(pmem->phys_addr + offset);
 
 	if (bb->count &&
 	    badblocks_check(bb, sector, num, &first_bad, &num_bad)) {
@@ -301,9 +300,9 @@ static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
 				   PAGE_SIZE));
 }
 
-static long pmem_dax_direct_access(struct dax_device *dax_dev,
-		pgoff_t pgoff, long nr_pages, enum dax_access_mode mode,
-		void **kaddr, pfn_t *pfn)
+static long pmem_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
+				   long nr_pages, enum dax_access_mode mode,
+				   void **kaddr, unsigned long *pfn)
 {
 	struct pmem_device *pmem = dax_get_private(dax_dev);
 
@@ -432,7 +431,8 @@ static void pmem_release_disk(void *__pmem)
 }
 
 static int pmem_pagemap_memory_failure(struct dev_pagemap *pgmap,
-		unsigned long pfn, unsigned long nr_pages, int mf_flags)
+				       unsigned long pfn,
+				       unsigned long nr_pages, int mf_flags)
 {
 	struct pmem_device *pmem =
 			container_of(pgmap, struct pmem_device, pgmap);
@@ -513,7 +513,6 @@ static int pmem_attach_disk(struct device *dev,
 
 	pmem->disk = disk;
 	pmem->pgmap.owner = pmem;
-	pmem->pfn_flags = 0;
 	if (is_nd_pfn(dev)) {
 		pmem->pgmap.type = MEMORY_DEVICE_FS_DAX;
 		pmem->pgmap.ops = &fsdax_pagemap_ops;
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 392b0b38acb9..99ce3ac51fdd 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -5,7 +5,6 @@
 #include <linux/badblocks.h>
 #include <linux/memremap.h>
 #include <linux/types.h>
-#include <linux/pfn_t.h>
 #include <linux/fs.h>
 
 enum dax_access_mode;
@@ -16,7 +15,6 @@ struct pmem_device {
 	phys_addr_t		phys_addr;
 	/* when non-zero this device is hosting a 'pfn' instance */
 	phys_addr_t		data_offset;
-	u64			pfn_flags;
 	void			*virt_addr;
 	/* immutable base size of the namespace */
 	size_t			size;
@@ -30,8 +28,8 @@ struct pmem_device {
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn);
+			  long nr_pages, enum dax_access_mode mode,
+			  void **kaddr, unsigned long *pfn);
 
 #ifdef CONFIG_MEMORY_FAILURE
 static inline bool test_and_clear_pmem_poison(struct page *page)
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index d1bc79cf56bd..9b537020fe25 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -17,7 +17,6 @@
 #include <linux/blkdev.h>
 #include <linux/completion.h>
 #include <linux/interrupt.h>
-#include <linux/pfn_t.h>
 #include <linux/uio.h>
 #include <linux/dax.h>
 #include <linux/io.h>
@@ -32,8 +31,8 @@ static int dcssblk_open(struct gendisk *disk, blk_mode_t mode);
 static void dcssblk_release(struct gendisk *disk);
 static void dcssblk_submit_bio(struct bio *bio);
 static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn);
+				      long nr_pages, enum dax_access_mode mode,
+				      void **kaddr, unsigned long *pfn);
 
 static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0";
 
@@ -919,9 +918,9 @@ dcssblk_submit_bio(struct bio *bio)
 	bio_io_error(bio);
 }
 
-static long
-__dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+static long __dcssblk_direct_access(struct dcssblk_dev_info *dev_info,
+				    pgoff_t pgoff, long nr_pages, void **kaddr,
+				    unsigned long *pfn)
 {
 	resource_size_t offset = pgoff * PAGE_SIZE;
 	unsigned long dev_sz;
@@ -930,16 +929,14 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = __va(dev_info->start + offset);
 	if (pfn)
-		*pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
-				      PFN_DEV);
+		*pfn = PFN_DOWN(dev_info->start + offset);
 
 	return (dev_sz - offset) / PAGE_SIZE;
 }
 
-static long
-dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
+				      long nr_pages, enum dax_access_mode mode,
+				      void **kaddr, unsigned long *pfn)
 {
 	struct dcssblk_dev_info *dev_info = dax_get_private(dax_dev);
 
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index b84d1747a020..ba7f7ca2aebc 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -17,7 +17,6 @@
 #include <linux/fs.h>
 #include <linux/file.h>
 #include <linux/pagemap.h>
-#include <linux/pfn_t.h>
 #include <linux/ramfs.h>
 #include <linux/init.h>
 #include <linux/string.h>
@@ -412,7 +411,8 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
 		for (i = 0; i < pages && !ret; i++) {
 			vm_fault_t vmf;
 			unsigned long off = i * PAGE_SIZE;
-			pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV);
+			unsigned long pfn = PHYS_PFN(address + off);
+
 			vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
 			if (vmf & VM_FAULT_ERROR)
 				ret = vm_fault_to_errno(vmf, 0);
diff --git a/fs/dax.c b/fs/dax.c
index 72d6d4586330..fcbe62bde685 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -20,7 +20,6 @@
 #include <linux/sched/signal.h>
 #include <linux/uio.h>
 #include <linux/vmstat.h>
-#include <linux/pfn_t.h>
 #include <linux/sizes.h>
 #include <linux/mmu_notifier.h>
 #include <linux/iomap.h>
@@ -76,9 +75,9 @@ static struct folio *dax_to_folio(void *entry)
 	return page_folio(pfn_to_page(dax_to_pfn(entry)));
 }
 
-static void *dax_make_entry(pfn_t pfn, unsigned long flags)
+static void *dax_make_entry(unsigned long pfn, unsigned long flags)
 {
-	return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT));
+	return xa_mk_value(flags | (pfn << DAX_SHIFT));
 }
 
 static bool dax_is_locked(void *entry)
@@ -612,7 +611,7 @@ static void *grab_mapping_entry(struct xa_state *xas,
 
 		if (order > 0)
 			flags |= DAX_PMD;
-		entry = dax_make_entry(pfn_to_pfn_t(0), flags);
+		entry = dax_make_entry(0, flags);
 		dax_lock_entry(xas, entry);
 		if (xas_error(xas))
 			goto out_unlock;
@@ -837,7 +836,7 @@ static bool dax_fault_is_synchronous(const struct iomap_iter *iter,
  * appropriate.
  */
 static void *dax_insert_entry(struct xa_state *xas, struct vm_fault *vmf,
-		const struct iomap_iter *iter, void *entry, pfn_t pfn,
+		const struct iomap_iter *iter, void *entry, unsigned long pfn,
 		unsigned long flags)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
@@ -1036,7 +1035,8 @@ int dax_writeback_mapping_range(struct address_space *mapping,
 EXPORT_SYMBOL_GPL(dax_writeback_mapping_range);
 
 static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos,
-		size_t size, void **kaddr, pfn_t *pfnp)
+				   size_t size, void **kaddr,
+				   unsigned long *pfnp)
 {
 	pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
 	int id, rc = 0;
@@ -1054,7 +1054,7 @@ static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos,
 	rc = -EINVAL;
 	if (PFN_PHYS(length) < size)
 		goto out;
-	if (pfn_t_to_pfn(*pfnp) & (PHYS_PFN(size)-1))
+	if (*pfnp & (PHYS_PFN(size)-1))
 		goto out;
 
 	rc = 0;
@@ -1158,8 +1158,8 @@ static vm_fault_t dax_load_hole(struct xa_state *xas, struct vm_fault *vmf,
 {
 	struct inode *inode = iter->inode;
 	unsigned long vaddr = vmf->address;
-	pfn_t pfn = pfn_to_pfn_t(my_zero_pfn(vaddr));
-	struct page *page = pfn_t_to_page(pfn);
+	unsigned long pfn = my_zero_pfn(vaddr);
+	struct page *page = pfn_to_page(pfn);
 	vm_fault_t ret;
 
 	*entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_ZERO_PAGE);
@@ -1183,7 +1183,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf,
 	struct folio *zero_folio;
 	spinlock_t *ptl;
 	pmd_t pmd_entry;
-	pfn_t pfn;
+	unsigned long pfn;
 
 	if (arch_needs_pgtable_deposit()) {
 		pgtable = pte_alloc_one(vma->vm_mm);
@@ -1195,7 +1195,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf,
 	if (unlikely(!zero_folio))
 		goto fallback;
 
-	pfn = page_to_pfn_t(&zero_folio->page);
+	pfn = page_to_pfn(&zero_folio->page);
 	*entry = dax_insert_entry(xas, vmf, iter, *entry, pfn,
 				  DAX_PMD | DAX_ZERO_PAGE);
 
@@ -1564,7 +1564,7 @@ static vm_fault_t dax_fault_return(int error)
  * insertion for now and return the pfn so that caller can insert it after the
  * fsync is done.
  */
-static vm_fault_t dax_fault_synchronous_pfnp(pfn_t *pfnp, pfn_t pfn)
+static vm_fault_t dax_fault_synchronous_pfnp(unsigned long *pfnp, unsigned long pfn)
 {
 	if (WARN_ON_ONCE(!pfnp))
 		return VM_FAULT_SIGBUS;
@@ -1612,8 +1612,9 @@ static vm_fault_t dax_fault_cow_page(struct vm_fault *vmf,
  * @pmd:	distinguish whether it is a pmd fault
  */
 static vm_fault_t dax_fault_iter(struct vm_fault *vmf,
-		const struct iomap_iter *iter, pfn_t *pfnp,
-		struct xa_state *xas, void **entry, bool pmd)
+				 const struct iomap_iter *iter,
+				 unsigned long *pfnp, struct xa_state *xas,
+				 void **entry, bool pmd)
 {
 	const struct iomap *iomap = &iter->iomap;
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
@@ -1622,7 +1623,7 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf,
 	bool write = iter->flags & IOMAP_WRITE;
 	unsigned long entry_flags = pmd ? DAX_PMD : 0;
 	int ret, err = 0;
-	pfn_t pfn;
+	unsigned long pfn;
 	void *kaddr;
 	struct page *page;
 
@@ -1657,7 +1658,7 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf,
 	if (dax_fault_is_synchronous(iter, vmf->vma))
 		return dax_fault_synchronous_pfnp(pfnp, pfn);
 
-	page = pfn_t_to_page(pfn);
+	page = pfn_to_page(pfn);
 	page_ref_inc(page);
 
 	if (pmd)
@@ -1674,8 +1675,9 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf,
 	return ret;
 }
 
-static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp,
-			       int *iomap_errp, const struct iomap_ops *ops)
+static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, unsigned long *pfnp,
+				      int *iomap_errp,
+				      const struct iomap_ops *ops)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
 	XA_STATE(xas, &mapping->i_pages, vmf->pgoff);
@@ -1784,7 +1786,7 @@ static bool dax_fault_check_fallback(struct vm_fault *vmf, struct xa_state *xas,
 	return false;
 }
 
-static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
+static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, unsigned long *pfnp,
 			       const struct iomap_ops *ops)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
@@ -1863,8 +1865,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
 	return ret;
 }
 #else
-static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
-			       const struct iomap_ops *ops)
+static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, unsigned long *pfnp,
+				      const struct iomap_ops *ops)
 {
 	return VM_FAULT_FALLBACK;
 }
@@ -1884,7 +1886,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp,
  * successfully.
  */
 vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order,
-		    pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops)
+			   unsigned long *pfnp, int *iomap_errp,
+			   const struct iomap_ops *ops)
 {
 	if (order == 0)
 		return dax_iomap_pte_fault(vmf, pfnp, iomap_errp, ops);
@@ -1905,7 +1908,7 @@ EXPORT_SYMBOL_GPL(dax_iomap_fault);
  * for an mmaped DAX file.  It also marks the page cache entry as dirty.
  */
 static vm_fault_t
-dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
+dax_insert_pfn_mkwrite(struct vm_fault *vmf, unsigned long pfn, unsigned int order)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
 	XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, order);
@@ -1927,7 +1930,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
 	xas_set_mark(&xas, PAGECACHE_TAG_DIRTY);
 	dax_lock_entry(&xas, entry);
 	xas_unlock_irq(&xas);
-	page = pfn_t_to_page(pfn);
+	page = pfn_to_page(pfn);
 	page_ref_inc(page);
 	if (order == 0)
 		ret = dax_insert_pfn(vmf, pfn, true);
@@ -1954,7 +1957,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order)
  * table entry.
  */
 vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
-		pfn_t pfn)
+				 unsigned long pfn)
 {
 	int err;
 	loff_t start = ((loff_t)vmf->pgoff) << PAGE_SHIFT;
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c89e434db6b7..13e939bcc7ac 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -722,7 +722,7 @@ static vm_fault_t ext4_dax_huge_fault(struct vm_fault *vmf, unsigned int order)
 	bool write = (vmf->flags & FAULT_FLAG_WRITE) &&
 		(vmf->vma->vm_flags & VM_SHARED);
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
-	pfn_t pfn;
+	unsigned long pfn;
 
 	if (write) {
 		sb_start_pagefault(sb);
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index da505956208f..0b6b440520da 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -10,7 +10,6 @@
 #include <linux/dax.h>
 #include <linux/uio.h>
 #include <linux/pagemap.h>
-#include <linux/pfn_t.h>
 #include <linux/iomap.h>
 #include <linux/interval_tree.h>
 
@@ -788,7 +787,7 @@ static vm_fault_t __fuse_dax_fault(struct vm_fault *vmf, unsigned int order,
 	vm_fault_t ret;
 	struct inode *inode = file_inode(vmf->vma->vm_file);
 	struct super_block *sb = inode->i_sb;
-	pfn_t pfn;
+	unsigned long pfn;
 	int error = 0;
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_conn_dax *fcd = fc->dax;
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index f79a94d148da..e49e2ae33206 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -9,7 +9,6 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/group_cpus.h>
-#include <linux/pfn_t.h>
 #include <linux/memremap.h>
 #include <linux/module.h>
 #include <linux/virtio.h>
@@ -866,7 +865,7 @@ static void virtio_fs_cleanup_vqs(struct virtio_device *vdev)
  */
 static long virtio_fs_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 				    long nr_pages, enum dax_access_mode mode,
-				    void **kaddr, pfn_t *pfn)
+				    void **kaddr, unsigned long *pfn)
 {
 	struct virtio_fs *fs = dax_get_private(dax_dev);
 	phys_addr_t offset = PFN_PHYS(pgoff);
@@ -875,7 +874,7 @@ static long virtio_fs_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = fs->window_kaddr + offset;
 	if (pfn)
-		*pfn = phys_to_pfn_t(fs->window_phys_addr + offset, 0);
+		*pfn = PHYS_PFN(fs->window_phys_addr + offset);
 	return nr_pages > max_nr_pages ? max_nr_pages : nr_pages;
 }
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4cdc54dc9686..47edb2785ad2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1243,7 +1243,7 @@ xfs_dax_fault_locked(
 	bool			write_fault)
 {
 	vm_fault_t		ret;
-	pfn_t			pfn;
+	unsigned long		pfn;
 
 	if (!IS_ENABLED(CONFIG_FS_DAX)) {
 		ASSERT(0);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 0f6f355ec3b5..153dd2398178 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -26,7 +26,7 @@ struct dax_operations {
 	 * number of pages available for DAX at that pfn.
 	 */
 	long (*direct_access)(struct dax_device *, pgoff_t, long,
-			enum dax_access_mode, void **, pfn_t *);
+			      enum dax_access_mode, void **, unsigned long *);
 	/*
 	 * Validate whether this device is usable as an fsdax backing
 	 * device.
@@ -241,7 +241,8 @@ static inline void dax_read_unlock(int id)
 bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		enum dax_access_mode mode, void **kaddr, pfn_t *pfn);
+		       enum dax_access_mode mode, void **kaddr,
+		       unsigned long *pfn);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
@@ -255,9 +256,10 @@ void dax_flush(struct dax_device *dax_dev, void *addr, size_t size);
 ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops);
 vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order,
-		    pfn_t *pfnp, int *errp, const struct iomap_ops *ops);
-vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf,
-		unsigned int order, pfn_t pfn);
+			   unsigned long *pfnp, int *errp,
+			   const struct iomap_ops *ops);
+vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
+				 unsigned long pfn);
 int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index);
 int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
 				      pgoff_t index);
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 53ca3a913d06..05fadca5b588 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -147,9 +147,10 @@ typedef int (*dm_busy_fn) (struct dm_target *ti);
  *  < 0 : error
  * >= 0 : the number of bytes accessible at the address
  */
-typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode node, void **kaddr,
-		pfn_t *pfn);
+typedef long (*dm_dax_direct_access_fn)(struct dm_target *ti, pgoff_t pgoff,
+					long nr_pages,
+					enum dax_access_mode node, void **kaddr,
+					unsigned long *pfn);
 typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
 		size_t nr_pages);
 
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 79a24ac31080..a047379d94ad 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -38,10 +38,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		    pmd_t *pmd, unsigned long addr, pgprot_t newprot,
 		    unsigned long cp_flags);
 
-vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
+vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write);
 
 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_UNSUPPORTED,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d9517e109ac3..41a419c549ef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3437,14 +3437,15 @@ int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
 int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
-vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write);
+vm_fault_t dax_insert_pfn(struct vm_fault *vmf, unsigned long pfn, bool write);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn, pgprot_t pgprot);
 vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn);
-int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
+			    unsigned long pfn);
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start,
+		    unsigned long len);
 
 static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma,
 				unsigned long addr, struct page *page)
diff --git a/include/linux/pfn.h b/include/linux/pfn.h
index 14bc053c53d8..482cf9a07fda 100644
--- a/include/linux/pfn.h
+++ b/include/linux/pfn.h
@@ -2,19 +2,6 @@
 #ifndef _LINUX_PFN_H_
 #define _LINUX_PFN_H_
 
-#ifndef __ASSEMBLY__
-#include <linux/types.h>
-
-/*
- * pfn_t: encapsulates a page-frame number that is optionally backed
- * by memmap (struct page).  Whether a pfn_t has a 'struct page'
- * backing is indicated by flags in the high bits of the value.
- */
-typedef struct {
-	u64 val;
-} pfn_t;
-#endif
-
 #define PFN_ALIGN(x)	(((unsigned long)(x) + (PAGE_SIZE - 1)) & PAGE_MASK)
 #define PFN_UP(x)	(((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
 #define PFN_DOWN(x)	((x) >> PAGE_SHIFT)
diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
deleted file mode 100644
index 76e519b20553..000000000000
--- a/include/linux/pfn_t.h
+++ /dev/null
@@ -1,96 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_PFN_T_H_
-#define _LINUX_PFN_T_H_
-#include <linux/mm.h>
-
-/*
- * PFN_FLAGS_MASK - mask of all the possible valid pfn_t flags
- * PFN_SG_CHAIN - pfn is a pointer to the next scatterlist entry
- * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
- * PFN_DEV - pfn is not covered by system memmap by default
- * PFN_MAP - pfn has a dynamic page mapping established by a device driver
- */
-#define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
-#define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
-#define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
-#define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
-#define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
-
-#define PFN_FLAGS_TRACE \
-	{ PFN_SG_CHAIN,	"SG_CHAIN" }, \
-	{ PFN_SG_LAST,	"SG_LAST" }, \
-	{ PFN_DEV,	"DEV" }, \
-	{ PFN_MAP,	"MAP" }
-
-static inline pfn_t __pfn_to_pfn_t(unsigned long pfn, u64 flags)
-{
-	pfn_t pfn_t = { .val = pfn | (flags & PFN_FLAGS_MASK), };
-
-	return pfn_t;
-}
-
-/* a default pfn to pfn_t conversion assumes that @pfn is pfn_valid() */
-static inline pfn_t pfn_to_pfn_t(unsigned long pfn)
-{
-	return __pfn_to_pfn_t(pfn, 0);
-}
-
-static inline pfn_t phys_to_pfn_t(phys_addr_t addr, u64 flags)
-{
-	return __pfn_to_pfn_t(addr >> PAGE_SHIFT, flags);
-}
-
-static inline bool pfn_t_has_page(pfn_t pfn)
-{
-	return (pfn.val & PFN_MAP) == PFN_MAP || (pfn.val & PFN_DEV) == 0;
-}
-
-static inline unsigned long pfn_t_to_pfn(pfn_t pfn)
-{
-	return pfn.val & ~PFN_FLAGS_MASK;
-}
-
-static inline struct page *pfn_t_to_page(pfn_t pfn)
-{
-	if (pfn_t_has_page(pfn))
-		return pfn_to_page(pfn_t_to_pfn(pfn));
-	return NULL;
-}
-
-static inline phys_addr_t pfn_t_to_phys(pfn_t pfn)
-{
-	return PFN_PHYS(pfn_t_to_pfn(pfn));
-}
-
-static inline pfn_t page_to_pfn_t(struct page *page)
-{
-	return pfn_to_pfn_t(page_to_pfn(page));
-}
-
-static inline int pfn_t_valid(pfn_t pfn)
-{
-	return pfn_valid(pfn_t_to_pfn(pfn));
-}
-
-#ifdef CONFIG_MMU
-static inline pte_t pfn_t_pte(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pte(pfn_t_to_pfn(pfn), pgprot);
-}
-#endif
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static inline pmd_t pfn_t_pmd(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pmd(pfn_t_to_pfn(pfn), pgprot);
-}
-
-#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
-static inline pud_t pfn_t_pud(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pud(pfn_t_to_pfn(pfn), pgprot);
-}
-#endif
-#endif
-
-#endif /* _LINUX_PFN_T_H_ */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index f3a95e38872c..d51e87e1adae 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1513,7 +1513,7 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
  * by vmf_insert_pfn().
  */
 static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-				    pfn_t pfn)
+				    unsigned long pfn)
 {
 }
 
@@ -1549,7 +1549,7 @@ extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long pfn, unsigned long addr,
 			   unsigned long size);
 extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-			     pfn_t pfn);
+			     unsigned long pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size, bool mm_wr_locked);
diff --git a/include/trace/events/fs_dax.h b/include/trace/events/fs_dax.h
index 86fe6aecff1e..10f706e37040 100644
--- a/include/trace/events/fs_dax.h
+++ b/include/trace/events/fs_dax.h
@@ -104,14 +104,14 @@ DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole_fallback);
 
 DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
-		long length, pfn_t pfn, void *radix_entry),
+		long length, unsigned long pfn, void *radix_entry),
 	TP_ARGS(inode, vmf, length, pfn, radix_entry),
 	TP_STRUCT__entry(
 		__field(unsigned long, ino)
 		__field(unsigned long, vm_flags)
 		__field(unsigned long, address)
 		__field(long, length)
-		__field(u64, pfn_val)
+		__field(unsigned long, pfn)
 		__field(void *, radix_entry)
 		__field(dev_t, dev)
 		__field(int, write)
@@ -123,11 +123,11 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 		__entry->address = vmf->address;
 		__entry->write = vmf->flags & FAULT_FLAG_WRITE;
 		__entry->length = length;
-		__entry->pfn_val = pfn.val;
+		__entry->pfn = pfn;
 		__entry->radix_entry = radix_entry;
 	),
 	TP_printk("dev %d:%d ino %#lx %s %s address %#lx length %#lx "
-			"pfn %#llx %s radix_entry %#lx",
+			"pfn %#lx radix_entry %#lx",
 		MAJOR(__entry->dev),
 		MINOR(__entry->dev),
 		__entry->ino,
@@ -135,9 +135,7 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 		__entry->write ? "write" : "read",
 		__entry->address,
 		__entry->length,
-		__entry->pfn_val & ~PFN_FLAGS_MASK,
-		__print_flags_u64(__entry->pfn_val & PFN_FLAGS_MASK, "|",
-			PFN_FLAGS_TRACE),
+		__entry->pfn,
 		(unsigned long)__entry->radix_entry
 	)
 )
@@ -145,7 +143,7 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 #define DEFINE_PMD_INSERT_MAPPING_EVENT(name) \
 DEFINE_EVENT(dax_pmd_insert_mapping_class, name, \
 	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
-		long length, pfn_t pfn, void *radix_entry), \
+		long length, unsigned long pfn, void *radix_entry), \
 	TP_ARGS(inode, vmf, length, pfn, radix_entry))
 
 DEFINE_PMD_INSERT_MAPPING_EVENT(dax_pmd_insert_mapping);
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 1262148d97b7..ec8e8d746658 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -20,7 +20,6 @@
 #include <linux/mman.h>
 #include <linux/mm_types.h>
 #include <linux/module.h>
-#include <linux/pfn_t.h>
 #include <linux/printk.h>
 #include <linux/pgtable.h>
 #include <linux/random.h>
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7c39950bfdae..ea65c2db2bb1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -23,7 +23,6 @@
 #include <linux/mm_types.h>
 #include <linux/khugepaged.h>
 #include <linux/freezer.h>
-#include <linux/pfn_t.h>
 #include <linux/mman.h>
 #include <linux/memremap.h>
 #include <linux/pagemap.h>
@@ -1232,15 +1231,15 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 }
 
 static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, pfn_t pfn, pgprot_t prot, bool write,
-		pgtable_t pgtable)
+			   pmd_t *pmd, unsigned long pfn, pgprot_t prot,
+			   bool write, pgtable_t pgtable)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t entry;
 
 	if (!pmd_none(*pmd)) {
 		if (write) {
-			if (pmd_pfn(*pmd) != pfn_t_to_pfn(pfn)) {
+			if (pmd_pfn(*pmd) != pfn) {
 				WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
 				return;
 			}
@@ -1253,7 +1252,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 		return;
 	}
 
-	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
+	entry = pmd_mkhuge(pfn_pmd(pfn, prot));
 	if (write) {
 		entry = pmd_mkyoung(pmd_mkdirty(entry));
 		entry = maybe_pmd_mkwrite(entry, vma);
@@ -1279,7 +1278,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	unsigned long addr = vmf->address & PMD_MASK;
 	struct vm_area_struct *vma = vmf->vma;
@@ -1316,7 +1315,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
 }
 EXPORT_SYMBOL_GPL(vmf_insert_pfn_pmd);
 
-vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address & PMD_MASK;
@@ -1339,7 +1338,7 @@ vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
 
 	ptl = pmd_lock(mm, vmf->pmd);
 	if (pmd_none(*vmf->pmd)) {
-		page = pfn_t_to_page(pfn);
+		page = pfn_to_page(pfn);
 		folio = page_folio(page);
 		folio_get(folio);
 		folio_add_file_rmap_pmd(folio, page, vma);
@@ -1364,7 +1363,7 @@ static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
 }
 
 static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
-		pud_t *pud, pfn_t pfn, bool write)
+			   pud_t *pud, unsigned long pfn, bool write)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pgprot_t prot = vma->vm_page_prot;
@@ -1372,7 +1371,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 
 	if (!pud_none(*pud)) {
 		if (write) {
-			if (pud_pfn(*pud) != pfn_t_to_pfn(pfn)) {
+			if (pud_pfn(*pud) != pfn) {
 				WARN_ON_ONCE(!is_huge_zero_pud(*pud));
 				return;
 			}
@@ -1384,7 +1383,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 		return;
 	}
 
-	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
+	entry = pud_mkhuge(pfn_pud(pfn, prot));
 	if (write) {
 		entry = pud_mkyoung(pud_mkdirty(entry));
 		entry = maybe_pud_mkwrite(entry, vma);
@@ -1403,7 +1402,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	unsigned long addr = vmf->address & PUD_MASK;
 	struct vm_area_struct *vma = vmf->vma;
@@ -1440,7 +1439,7 @@ EXPORT_SYMBOL_GPL(vmf_insert_pfn_pud);
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address & PUD_MASK;
@@ -1458,7 +1457,7 @@ vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
 
 	ptl = pud_lock(mm, pud);
 	if (pud_none(*vmf->pud)) {
-		page = pfn_t_to_page(pfn);
+		page = pfn_to_page(pfn);
 		folio = page_folio(page);
 		folio_get(folio);
 		folio_add_file_rmap_pud(folio, page, vma);
diff --git a/mm/memory.c b/mm/memory.c
index 721aac02a636..ed75f561d445 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -59,7 +59,6 @@
 #include <linux/export.h>
 #include <linux/delayacct.h>
 #include <linux/init.h>
-#include <linux/pfn_t.h>
 #include <linux/writeback.h>
 #include <linux/memcontrol.h>
 #include <linux/mmu_notifier.h>
@@ -2327,7 +2326,7 @@ int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
 EXPORT_SYMBOL(vm_map_pages_zero);
 
 static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
-			     pfn_t pfn, pgprot_t prot)
+			     unsigned long pfn, pgprot_t prot)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte, entry;
@@ -2341,7 +2340,7 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		goto out_unlock;
 
 	/* Ok, finally just insert the thing.. */
-	entry = pte_mkspecial(pfn_t_pte(pfn, prot));
+	entry = pte_mkspecial(pfn_pte(pfn, prot));
 
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
@@ -2385,7 +2384,7 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
  * Return: vm_fault_t value.
  */
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn, pgprot_t pgprot)
+			       unsigned long pfn, pgprot_t pgprot)
 {
 	/*
 	 * Technically, architectures with pte_special can avoid all these
@@ -2405,9 +2404,9 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;
 
-	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
+	track_pfn_insert(vma, &pgprot, pfn);
 
-	return insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
+	return insert_pfn(vma, addr, pfn, pgprot);
 }
 EXPORT_SYMBOL(vmf_insert_pfn_prot);
 
@@ -2438,21 +2437,20 @@ vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vmf_insert_pfn);
 
-static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
+static bool vm_mixed_ok(struct vm_area_struct *vma, unsigned long pfn)
 {
-	if (unlikely(is_zero_pfn(pfn_t_to_pfn(pfn))) &&
-	    !vm_mixed_zeropage_allowed(vma))
+	if (unlikely(is_zero_pfn(pfn)) && !vm_mixed_zeropage_allowed(vma))
 		return false;
 	/* these checks mirror the abort conditions in vm_normal_page */
 	if (vma->vm_flags & VM_MIXEDMAP)
 		return true;
-	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
+	if (is_zero_pfn(pfn))
 		return true;
 	return false;
 }
 
 vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			    pfn_t pfn)
+			    unsigned long pfn)
 {
 	pgprot_t pgprot = vma->vm_page_prot;
 	int err;
@@ -2465,7 +2463,7 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 
 	track_pfn_insert(vma, &pgprot, pfn);
 
-	if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot))
+	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;
 
 	/*
@@ -2475,15 +2473,10 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 	 * than insert_pfn).  If a zero_pfn were inserted into a VM_MIXEDMAP
 	 * without pte special, it would there be refcounted as a normal page.
 	 */
-	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pfn_t_valid(pfn)) {
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pfn_valid(pfn)) {
 		struct page *page;
 
-		/*
-		 * At this point we are committed to insert_page()
-		 * regardless of whether the caller specified flags that
-		 * result in pfn_t_has_page() == false.
-		 */
-		page = pfn_to_page(pfn_t_to_pfn(pfn));
+		page = pfn_to_page(pfn);
 		err = insert_page(vma, addr, page, pgprot, false);
 	} else {
 		return insert_pfn(vma, addr, pfn, pgprot);
@@ -2498,11 +2491,10 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vmf_insert_mixed);
 
-vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
+vm_fault_t dax_insert_pfn(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	pgprot_t pgprot = vma->vm_page_prot;
-	unsigned long pfn = pfn_t_to_pfn(pfn_t);
 	struct page *page = pfn_to_page(pfn);
 	unsigned long addr = vmf->address;
 	int err;
@@ -2510,7 +2502,7 @@ vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
 
-	track_pfn_insert(vma, &pgprot, pfn_t);
+	track_pfn_insert(vma, &pgprot, pfn);
 
 	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;
@@ -2518,7 +2510,7 @@ vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
 	/*
 	 * We refcount the page normally so make sure pfn_valid is true.
 	 */
-	if (!pfn_t_valid(pfn_t))
+	if (!pfn_valid(pfn))
 		return VM_FAULT_SIGBUS;
 
 	if (WARN_ON(is_zero_pfn(pfn) && write))
diff --git a/mm/memremap.c b/mm/memremap.c
index 30bb99301b18..2b92195638db 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -5,7 +5,6 @@
 #include <linux/kasan.h>
 #include <linux/memory_hotplug.h>
 #include <linux/memremap.h>
-#include <linux/pfn_t.h>
 #include <linux/swap.h>
 #include <linux/mm.h>
 #include <linux/mmzone.h>
diff --git a/mm/migrate.c b/mm/migrate.c
index ba4893d42618..18d19ef24311 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -37,7 +37,6 @@
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
-#include <linux/pfn_t.h>
 #include <linux/memremap.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/balloon_compaction.h>
diff --git a/tools/testing/nvdimm/pmem-dax.c b/tools/testing/nvdimm/pmem-dax.c
index c1ec099a3b1d..f5ef3d034db5 100644
--- a/tools/testing/nvdimm/pmem-dax.c
+++ b/tools/testing/nvdimm/pmem-dax.c
@@ -9,8 +9,8 @@
 #include <nd.h>
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+			  long nr_pages, enum dax_access_mode mode,
+			  void **kaddr, unsigned long *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 
@@ -29,7 +29,7 @@ long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 			*kaddr = pmem->virt_addr + offset;
 		page = vmalloc_to_page(pmem->virt_addr + offset);
 		if (pfn)
-			*pfn = page_to_pfn_t(page);
+			*pfn = page_to_pfn(page);
 		pr_debug_ratelimited("%s: pmem: %p pgoff: %#lx pfn: %#lx\n",
 				__func__, pmem, pgoff, page_to_pfn(page));
 
@@ -39,7 +39,7 @@ long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = pmem->virt_addr + offset;
 	if (pfn)
-		*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
+		*pfn = PHYS_PFN(pmem->phys_addr + offset);
 
 	/*
 	 * If badblocks are present, limit known good range to the
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index e4313726fae3..f7e7bfe9bb85 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -8,7 +8,6 @@
 #include <linux/ioport.h>
 #include <linux/module.h>
 #include <linux/types.h>
-#include <linux/pfn_t.h>
 #include <linux/acpi.h>
 #include <linux/io.h>
 #include <linux/mm.h>
@@ -135,16 +134,6 @@ void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 }
 EXPORT_SYMBOL_GPL(__wrap_devm_memremap_pages);
 
-pfn_t __wrap_phys_to_pfn_t(phys_addr_t addr, unsigned long flags)
-{
-	struct nfit_test_resource *nfit_res = get_nfit_res(addr);
-
-	if (nfit_res)
-		flags &= ~PFN_MAP;
-        return phys_to_pfn_t(addr, flags);
-}
-EXPORT_SYMBOL(__wrap_phys_to_pfn_t);
-
 void *__wrap_memremap(resource_size_t offset, size_t size,
 		unsigned long flags)
 {
diff --git a/tools/testing/nvdimm/test/nfit_test.h b/tools/testing/nvdimm/test/nfit_test.h
index b00583d1eace..b9047fb8ea4a 100644
--- a/tools/testing/nvdimm/test/nfit_test.h
+++ b/tools/testing/nvdimm/test/nfit_test.h
@@ -212,7 +212,6 @@ void __iomem *__wrap_devm_ioremap(struct device *dev,
 void *__wrap_devm_memremap(struct device *dev, resource_size_t offset,
 		size_t size, unsigned long flags);
 void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap);
-pfn_t __wrap_phys_to_pfn_t(phys_addr_t addr, unsigned long flags);
 void *__wrap_memremap(resource_size_t offset, size_t size,
 		unsigned long flags);
 void __wrap_devm_memunmap(struct device *dev, void *addr);
kernel test robot Sept. 25, 2024, 4:29 a.m. UTC | #2
Hi Dan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on s390/features]
[also build test WARNING on brauner-vfs/vfs.all akpm-mm/mm-everything linus/master v6.11 next-20240924]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/dcssblk-Mark-DAX-broken/20240925-070047
base:   https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
patch link:    https://lore.kernel.org/r/172721874675.497781.3277495908107141898.stgit%40dwillia2-xfh.jf.intel.com
patch subject: [PATCH] dcssblk: Mark DAX broken
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20240925/202409251251.i8yVl4yR-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240925/202409251251.i8yVl4yR-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409251251.i8yVl4yR-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/s390/block/dcssblk.c: In function 'dcssblk_add_store':
>> drivers/s390/block/dcssblk.c:571:28: warning: unused variable 'dax_dev' [-Wunused-variable]
     571 |         struct dax_device *dax_dev;
         |                            ^~~~~~~
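
For readers unfamiliar with this class of warning: it typically appears when a local variable stays declared in a function after its only uses have been moved elsewhere or compiled out. The sketch below is a generic illustration and not the dcssblk code. Every name in it (EXAMPLE_HAVE_DAX, example_dax_device, example_setup_dax, example_add_store) is hypothetical. It shows one common way to keep the caller warning-free: let a helper own the optional-feature variable, with a stub variant when the feature is disabled.

```c
#include <stddef.h>

/* Hypothetical stand-in type; it does not come from dcssblk.c. */
struct example_dax_device { int id; };

/*
 * The helper owns the optional-feature local. When EXAMPLE_HAVE_DAX is not
 * defined, the stub variant declares no such local, so the caller has
 * nothing left over for -Wunused-variable to flag.
 */
#ifdef EXAMPLE_HAVE_DAX
static struct example_dax_device example_dev = { .id = 1 };

static int example_setup_dax(struct example_dax_device **out)
{
	struct example_dax_device *dax_dev = &example_dev;

	*out = dax_dev;
	return 0;
}
#else
static int example_setup_dax(struct example_dax_device **out)
{
	*out = NULL;	/* feature compiled out: nothing to attach */
	return 0;
}
#endif

/* The caller no longer declares a dax_dev local of its own. */
static int example_add_store(void)
{
	struct example_dax_device *attached;
	int rc = example_setup_dax(&attached);

	if (rc)
		return rc;
	return attached ? 1 : 0;	/* 1 if a device was attached, else 0 */
}
```

Whether the real fix should drop the declaration, move it into a helper, or guard it differently depends on the dcssblk.c hunk of the patch, which this sketch does not attempt to reproduce.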


vim +/dax_dev +571 drivers/s390/block/dcssblk.c

e265834f5da2c4 Dan Williams          2024-09-24  557  
^1da177e4c3f41 Linus Torvalds        2005-04-16  558  /*
^1da177e4c3f41 Linus Torvalds        2005-04-16  559   * device attribute for adding devices
^1da177e4c3f41 Linus Torvalds        2005-04-16  560   */
^1da177e4c3f41 Linus Torvalds        2005-04-16  561  static ssize_t
e404e274f62665 Yani Ioannou          2005-05-17  562  dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
^1da177e4c3f41 Linus Torvalds        2005-04-16  563  {
af190c53c995bf Christoph Hellwig     2024-02-15  564  	struct queue_limits lim = {
af190c53c995bf Christoph Hellwig     2024-02-15  565  		.logical_block_size	= 4096,
f467fee48da450 Christoph Hellwig     2024-06-17  566  		.features		= BLK_FEAT_DAX,
af190c53c995bf Christoph Hellwig     2024-02-15  567  	};
b2300b9efe1b81 Hongjie Yang          2008-10-10  568  	int rc, i, j, num_of_segments;
^1da177e4c3f41 Linus Torvalds        2005-04-16  569  	struct dcssblk_dev_info *dev_info;
b2300b9efe1b81 Hongjie Yang          2008-10-10  570  	struct segment_info *seg_info, *temp;
cf7fe690abbbe5 Mathieu Desnoyers     2024-02-15 @571  	struct dax_device *dax_dev;
^1da177e4c3f41 Linus Torvalds        2005-04-16  572  	char *local_buf;
^1da177e4c3f41 Linus Torvalds        2005-04-16  573  	unsigned long seg_byte_size;
^1da177e4c3f41 Linus Torvalds        2005-04-16  574  
^1da177e4c3f41 Linus Torvalds        2005-04-16  575  	dev_info = NULL;
b2300b9efe1b81 Hongjie Yang          2008-10-10  576  	seg_info = NULL;
^1da177e4c3f41 Linus Torvalds        2005-04-16  577  	if (dev != dcssblk_root_dev) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  578  		rc = -EINVAL;
^1da177e4c3f41 Linus Torvalds        2005-04-16  579  		goto out_nobuf;
^1da177e4c3f41 Linus Torvalds        2005-04-16  580  	}
b2300b9efe1b81 Hongjie Yang          2008-10-10  581  	if ((count < 1) || (buf[0] == '\0') || (buf[0] == '\n')) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  582  		rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang          2008-10-10  583  		goto out_nobuf;
b2300b9efe1b81 Hongjie Yang          2008-10-10  584  	}
b2300b9efe1b81 Hongjie Yang          2008-10-10  585  
^1da177e4c3f41 Linus Torvalds        2005-04-16  586  	local_buf = kmalloc(count + 1, GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds        2005-04-16  587  	if (local_buf == NULL) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  588  		rc = -ENOMEM;
^1da177e4c3f41 Linus Torvalds        2005-04-16  589  		goto out_nobuf;
^1da177e4c3f41 Linus Torvalds        2005-04-16  590  	}
b2300b9efe1b81 Hongjie Yang          2008-10-10  591  
^1da177e4c3f41 Linus Torvalds        2005-04-16  592  	/*
^1da177e4c3f41 Linus Torvalds        2005-04-16  593  	 * parse input
^1da177e4c3f41 Linus Torvalds        2005-04-16  594  	 */
b2300b9efe1b81 Hongjie Yang          2008-10-10  595  	num_of_segments = 0;
3a9f9183bdd341 Ameen Ali             2015-02-24  596  	for (i = 0; (i < count && (buf[i] != '\0') && (buf[i] != '\n')); i++) {
42cfc6b590c5eb Martin Schwidefsky    2015-08-19  597  		for (j = i; j < count &&
42cfc6b590c5eb Martin Schwidefsky    2015-08-19  598  			(buf[j] != ':') &&
b2300b9efe1b81 Hongjie Yang          2008-10-10  599  			(buf[j] != '\0') &&
42cfc6b590c5eb Martin Schwidefsky    2015-08-19  600  			(buf[j] != '\n'); j++) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  601  			local_buf[j-i] = toupper(buf[j]);
b2300b9efe1b81 Hongjie Yang          2008-10-10  602  		}
b2300b9efe1b81 Hongjie Yang          2008-10-10  603  		local_buf[j-i] = '\0';
b2300b9efe1b81 Hongjie Yang          2008-10-10  604  		if (((j - i) == 0) || ((j - i) > 8)) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  605  			rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang          2008-10-10  606  			goto seg_list_del;
^1da177e4c3f41 Linus Torvalds        2005-04-16  607  		}
b2300b9efe1b81 Hongjie Yang          2008-10-10  608  
b2300b9efe1b81 Hongjie Yang          2008-10-10  609  		rc = dcssblk_load_segment(local_buf, &seg_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  610  		if (rc < 0)
b2300b9efe1b81 Hongjie Yang          2008-10-10  611  			goto seg_list_del;
^1da177e4c3f41 Linus Torvalds        2005-04-16  612  		/*
^1da177e4c3f41 Linus Torvalds        2005-04-16  613  		 * get a struct dcssblk_dev_info
^1da177e4c3f41 Linus Torvalds        2005-04-16  614  		 */
b2300b9efe1b81 Hongjie Yang          2008-10-10  615  		if (num_of_segments == 0) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  616  			dev_info = kzalloc(sizeof(struct dcssblk_dev_info),
b2300b9efe1b81 Hongjie Yang          2008-10-10  617  					GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds        2005-04-16  618  			if (dev_info == NULL) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  619  				rc = -ENOMEM;
^1da177e4c3f41 Linus Torvalds        2005-04-16  620  				goto out;
^1da177e4c3f41 Linus Torvalds        2005-04-16  621  			}
^1da177e4c3f41 Linus Torvalds        2005-04-16  622  			strcpy(dev_info->segment_name, local_buf);
b2300b9efe1b81 Hongjie Yang          2008-10-10  623  			dev_info->segment_type = seg_info->segment_type;
b2300b9efe1b81 Hongjie Yang          2008-10-10  624  			INIT_LIST_HEAD(&dev_info->seg_list);
b2300b9efe1b81 Hongjie Yang          2008-10-10  625  		}
b2300b9efe1b81 Hongjie Yang          2008-10-10  626  		list_add_tail(&seg_info->lh, &dev_info->seg_list);
b2300b9efe1b81 Hongjie Yang          2008-10-10  627  		num_of_segments++;
b2300b9efe1b81 Hongjie Yang          2008-10-10  628  		i = j;
b2300b9efe1b81 Hongjie Yang          2008-10-10  629  
b2300b9efe1b81 Hongjie Yang          2008-10-10  630  		if ((buf[j] == '\0') || (buf[j] == '\n'))
b2300b9efe1b81 Hongjie Yang          2008-10-10  631  			break;
b2300b9efe1b81 Hongjie Yang          2008-10-10  632  	}
b2300b9efe1b81 Hongjie Yang          2008-10-10  633  
b2300b9efe1b81 Hongjie Yang          2008-10-10  634  	/* no trailing colon at the end of the input */
b2300b9efe1b81 Hongjie Yang          2008-10-10  635  	if ((i > 0) && (buf[i-1] == ':')) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  636  		rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang          2008-10-10  637  		goto seg_list_del;
b2300b9efe1b81 Hongjie Yang          2008-10-10  638  	}
820109fb11f24b Wolfram Sang          2022-08-18  639  	strscpy(local_buf, buf, i + 1);
b2300b9efe1b81 Hongjie Yang          2008-10-10  640  	dev_info->num_of_segments = num_of_segments;
b2300b9efe1b81 Hongjie Yang          2008-10-10  641  	rc = dcssblk_is_continuous(dev_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  642  	if (rc < 0)
b2300b9efe1b81 Hongjie Yang          2008-10-10  643  		goto seg_list_del;
b2300b9efe1b81 Hongjie Yang          2008-10-10  644  
b2300b9efe1b81 Hongjie Yang          2008-10-10  645  	dev_info->start = dcssblk_find_lowest_addr(dev_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  646  	dev_info->end = dcssblk_find_highest_addr(dev_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  647  
ef283688f54cc8 Kees Cook             2014-06-10  648  	dev_set_name(&dev_info->dev, "%s", dev_info->segment_name);
^1da177e4c3f41 Linus Torvalds        2005-04-16  649  	dev_info->dev.release = dcssblk_release_segment;
521b3d790c16fa Sebastian Ott         2012-10-01  650  	dev_info->dev.groups = dcssblk_dev_attr_groups;
^1da177e4c3f41 Linus Torvalds        2005-04-16  651  	INIT_LIST_HEAD(&dev_info->lh);
af190c53c995bf Christoph Hellwig     2024-02-15  652  	dev_info->gd = blk_alloc_disk(&lim, NUMA_NO_NODE);
74fa8f9c553f7b Christoph Hellwig     2024-02-15  653  	if (IS_ERR(dev_info->gd)) {
74fa8f9c553f7b Christoph Hellwig     2024-02-15  654  		rc = PTR_ERR(dev_info->gd);
b2300b9efe1b81 Hongjie Yang          2008-10-10  655  		goto seg_list_del;
^1da177e4c3f41 Linus Torvalds        2005-04-16  656  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  657  	dev_info->gd->major = dcssblk_major;
0692ef289f067d Christoph Hellwig     2021-05-21  658  	dev_info->gd->minors = DCSSBLK_MINORS_PER_DISK;
^1da177e4c3f41 Linus Torvalds        2005-04-16  659  	dev_info->gd->fops = &dcssblk_devops;
^1da177e4c3f41 Linus Torvalds        2005-04-16  660  	dev_info->gd->private_data = dev_info;
a41a11b4009580 Gerald Schaefer       2022-10-27  661  	dev_info->gd->flags |= GENHD_FL_NO_PART;
b2300b9efe1b81 Hongjie Yang          2008-10-10  662  
^1da177e4c3f41 Linus Torvalds        2005-04-16  663  	seg_byte_size = (dev_info->end - dev_info->start + 1);
^1da177e4c3f41 Linus Torvalds        2005-04-16  664  	set_capacity(dev_info->gd, seg_byte_size >> 9); // size in sectors
93098bf0157876 Hongjie Yang          2008-12-25  665  	pr_info("Loaded %s with total size %lu bytes and capacity %lu "
93098bf0157876 Hongjie Yang          2008-12-25  666  		"sectors\n", local_buf, seg_byte_size, seg_byte_size >> 9);
^1da177e4c3f41 Linus Torvalds        2005-04-16  667  
^1da177e4c3f41 Linus Torvalds        2005-04-16  668  	dev_info->save_pending = 0;
^1da177e4c3f41 Linus Torvalds        2005-04-16  669  	dev_info->is_shared = 1;
^1da177e4c3f41 Linus Torvalds        2005-04-16  670  	dev_info->dev.parent = dcssblk_root_dev;
^1da177e4c3f41 Linus Torvalds        2005-04-16  671  
^1da177e4c3f41 Linus Torvalds        2005-04-16  672  	/*
^1da177e4c3f41 Linus Torvalds        2005-04-16  673  	 *get minor, add to list
^1da177e4c3f41 Linus Torvalds        2005-04-16  674  	 */
^1da177e4c3f41 Linus Torvalds        2005-04-16  675  	down_write(&dcssblk_devices_sem);
b2300b9efe1b81 Hongjie Yang          2008-10-10  676  	if (dcssblk_get_segment_by_name(local_buf)) {
04f64b5756872b Gerald Schaefer       2008-08-21  677  		rc = -EEXIST;
b2300b9efe1b81 Hongjie Yang          2008-10-10  678  		goto release_gd;
04f64b5756872b Gerald Schaefer       2008-08-21  679  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  680  	rc = dcssblk_assign_free_minor(dev_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  681  	if (rc)
b2300b9efe1b81 Hongjie Yang          2008-10-10  682  		goto release_gd;
^1da177e4c3f41 Linus Torvalds        2005-04-16  683  	sprintf(dev_info->gd->disk_name, "dcssblk%d",
d0591485e15ccd Gerald Schaefer       2009-06-12  684  		dev_info->gd->first_minor);
^1da177e4c3f41 Linus Torvalds        2005-04-16  685  	list_add_tail(&dev_info->lh, &dcssblk_devices);
^1da177e4c3f41 Linus Torvalds        2005-04-16  686  
^1da177e4c3f41 Linus Torvalds        2005-04-16  687  	if (!try_module_get(THIS_MODULE)) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  688  		rc = -ENODEV;
b2300b9efe1b81 Hongjie Yang          2008-10-10  689  		goto dev_list_del;
^1da177e4c3f41 Linus Torvalds        2005-04-16  690  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  691  	/*
^1da177e4c3f41 Linus Torvalds        2005-04-16  692  	 * register the device
^1da177e4c3f41 Linus Torvalds        2005-04-16  693  	 */
^1da177e4c3f41 Linus Torvalds        2005-04-16  694  	rc = device_register(&dev_info->dev);
b2300b9efe1b81 Hongjie Yang          2008-10-10  695  	if (rc)
521b3d790c16fa Sebastian Ott         2012-10-01  696  		goto put_dev;
^1da177e4c3f41 Linus Torvalds        2005-04-16  697  
e265834f5da2c4 Dan Williams          2024-09-24  698  	rc = dcssblk_setup_dax(dev_info);
fb08a1908cb119 Christoph Hellwig     2021-11-29  699  	if (rc)
fb08a1908cb119 Christoph Hellwig     2021-11-29  700  		goto out_dax;
7a2765f6e82063 Dan Williams          2017-01-26  701  
521b3d790c16fa Sebastian Ott         2012-10-01  702  	get_device(&dev_info->dev);
1a5db707c859a4 Gerald Schaefer       2021-09-27  703  	rc = device_add_disk(&dev_info->dev, dev_info->gd, NULL);
1a5db707c859a4 Gerald Schaefer       2021-09-27  704  	if (rc)
fb08a1908cb119 Christoph Hellwig     2021-11-29  705  		goto out_dax_host;
436d1bc7fe6e78 Christian Borntraeger 2007-12-04  706  
^1da177e4c3f41 Linus Torvalds        2005-04-16  707  	switch (dev_info->segment_type) {
^1da177e4c3f41 Linus Torvalds        2005-04-16  708  		case SEG_TYPE_SR:
^1da177e4c3f41 Linus Torvalds        2005-04-16  709  		case SEG_TYPE_ER:
^1da177e4c3f41 Linus Torvalds        2005-04-16  710  		case SEG_TYPE_SC:
^1da177e4c3f41 Linus Torvalds        2005-04-16  711  			set_disk_ro(dev_info->gd,1);
^1da177e4c3f41 Linus Torvalds        2005-04-16  712  			break;
^1da177e4c3f41 Linus Torvalds        2005-04-16  713  		default:
^1da177e4c3f41 Linus Torvalds        2005-04-16  714  			set_disk_ro(dev_info->gd,0);
^1da177e4c3f41 Linus Torvalds        2005-04-16  715  			break;
^1da177e4c3f41 Linus Torvalds        2005-04-16  716  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  717  	up_write(&dcssblk_devices_sem);
^1da177e4c3f41 Linus Torvalds        2005-04-16  718  	rc = count;
^1da177e4c3f41 Linus Torvalds        2005-04-16  719  	goto out;
^1da177e4c3f41 Linus Torvalds        2005-04-16  720  
fb08a1908cb119 Christoph Hellwig     2021-11-29  721  out_dax_host:
c8f40a0bccefd6 Gerald Schaefer       2023-08-10  722  	put_device(&dev_info->dev);
fb08a1908cb119 Christoph Hellwig     2021-11-29  723  	dax_remove_host(dev_info->gd);
1a5db707c859a4 Gerald Schaefer       2021-09-27  724  out_dax:
1a5db707c859a4 Gerald Schaefer       2021-09-27  725  	kill_dax(dev_info->dax_dev);
1a5db707c859a4 Gerald Schaefer       2021-09-27  726  	put_dax(dev_info->dax_dev);
521b3d790c16fa Sebastian Ott         2012-10-01  727  put_dev:
^1da177e4c3f41 Linus Torvalds        2005-04-16  728  	list_del(&dev_info->lh);
8b9ab62662048a Christoph Hellwig     2022-06-19  729  	put_disk(dev_info->gd);
b2300b9efe1b81 Hongjie Yang          2008-10-10  730  	list_for_each_entry(seg_info, &dev_info->seg_list, lh) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  731  		segment_unload(seg_info->segment_name);
b2300b9efe1b81 Hongjie Yang          2008-10-10  732  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  733  	put_device(&dev_info->dev);
^1da177e4c3f41 Linus Torvalds        2005-04-16  734  	up_write(&dcssblk_devices_sem);
^1da177e4c3f41 Linus Torvalds        2005-04-16  735  	goto out;
b2300b9efe1b81 Hongjie Yang          2008-10-10  736  dev_list_del:
^1da177e4c3f41 Linus Torvalds        2005-04-16  737  	list_del(&dev_info->lh);
b2300b9efe1b81 Hongjie Yang          2008-10-10  738  release_gd:
8b9ab62662048a Christoph Hellwig     2022-06-19  739  	put_disk(dev_info->gd);
b2300b9efe1b81 Hongjie Yang          2008-10-10  740  	up_write(&dcssblk_devices_sem);
b2300b9efe1b81 Hongjie Yang          2008-10-10  741  seg_list_del:
b2300b9efe1b81 Hongjie Yang          2008-10-10  742  	if (dev_info == NULL)
b2300b9efe1b81 Hongjie Yang          2008-10-10  743  		goto out;
b2300b9efe1b81 Hongjie Yang          2008-10-10  744  	list_for_each_entry_safe(seg_info, temp, &dev_info->seg_list, lh) {
b2300b9efe1b81 Hongjie Yang          2008-10-10  745  		list_del(&seg_info->lh);
b2300b9efe1b81 Hongjie Yang          2008-10-10  746  		segment_unload(seg_info->segment_name);
b2300b9efe1b81 Hongjie Yang          2008-10-10  747  		kfree(seg_info);
b2300b9efe1b81 Hongjie Yang          2008-10-10  748  	}
^1da177e4c3f41 Linus Torvalds        2005-04-16  749  	kfree(dev_info);
^1da177e4c3f41 Linus Torvalds        2005-04-16  750  out:
^1da177e4c3f41 Linus Torvalds        2005-04-16  751  	kfree(local_buf);
^1da177e4c3f41 Linus Torvalds        2005-04-16  752  out_nobuf:
^1da177e4c3f41 Linus Torvalds        2005-04-16  753  	return rc;
^1da177e4c3f41 Linus Torvalds        2005-04-16  754  }
^1da177e4c3f41 Linus Torvalds        2005-04-16  755
Alexander Gordeev Sept. 26, 2024, 5:58 p.m. UTC | #3
On Tue, Sep 24, 2024 at 03:59:08PM -0700, Dan Williams wrote:

Hi Dan,

> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
> 
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
> 
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
> 
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
> 
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
...
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig   |   12 ++++++++++--
>  drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
>  fs/Kconfig                   |    9 +--------
>  fs/dax.c                     |   12 ------------
>  include/linux/pfn_t.h        |   15 ---------------
>  mm/memory.c                  |    2 --
>  mm/memremap.c                |    4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)

...

I guess you want to remove dcssblk from Documentation/filesystems/dax.rst.
Gerald is back from vacation on Monday and he will likely comment on this.

Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>

Thanks!
David Hildenbrand Sept. 27, 2024, 12:31 p.m. UTC | #4
On 25.09.24 00:59, Dan Williams wrote:
> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
> 
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
> 
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
> 
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
> 
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>
Gerald Schaefer Oct. 1, 2024, 10:29 a.m. UTC | #5
On Tue, 24 Sep 2024 15:59:08 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
> 
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
> 
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.

Thanks, that's great news! Without requiring the extra PTE bit, it should
now finally be possible to add struct pages and ZONE_DEVICE support for
dcssblk.

In the meantime, it is OK to pause the DAX support for dcssblk as you
suggested, and finally remove that ugly CONFIG_FS_DAX_LIMITED. Thanks
for bearing with us for so long!

> 
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).

Ok, I guess that PFN_SPECIAL was there because we had no struct pages for
the DCSS memory. When we come back, with proper ZONE_DEVICE and struct
pages, it should not be needed any more.

And yes, the chance to completely remove pfn_t, after Alistair's series,
is quite impressive and even more motivation than CONFIG_FS_DAX_LIMITED.

> 
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig   |   12 ++++++++++--
>  drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
>  fs/Kconfig                   |    9 +--------
>  fs/dax.c                     |   12 ------------
>  include/linux/pfn_t.h        |   15 ---------------
>  mm/memory.c                  |    2 --
>  mm/memremap.c                |    4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)

Provided you also remove the now-unused dax_dev definition at the top of
dcssblk_add_store(), as noticed by the kernel test robot:

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Dan Williams Oct. 8, 2024, 12:28 a.m. UTC | #6
Gerald Schaefer wrote:
[..]
> When you also remove the now unused dax_dev definition at the top of
> dcssblk_add_store(), as noticed by kernel test robot:

Yup, already have that fixup locally.

> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>

Thanks!
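
The fixup discussed above can be illustrated with a minimal userspace
sketch. This is not the actual follow-up patch; it only shows why the local
becomes unused: once allocation and registration move into a helper that
stores the handle in the per-device structure, the caller needs nothing but
the return code. All names here (DEMO_DAX_ENABLED, demo_dev_info,
demo_setup_dax, demo_add_store) are hypothetical stand-ins for
IS_ENABLED(CONFIG_DCSSBLK_DAX), struct dcssblk_dev_info,
dcssblk_setup_dax() and dcssblk_add_store().

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for IS_ENABLED(CONFIG_DCSSBLK_DAX). With the
 * option behind CONFIG_BROKEN it is normally off, so default to 0.
 */
#ifndef DEMO_DAX_ENABLED
#define DEMO_DAX_ENABLED 0
#endif

/* Hypothetical stand-in for struct dcssblk_dev_info. */
struct demo_dev_info {
	int dax_ready;	/* plays the role of dev_info->dax_dev */
};

/*
 * Mirrors the shape of the new helper: return success without doing
 * anything when the feature is compiled out, otherwise set up the
 * handle and store it in the per-device structure.
 */
static int demo_setup_dax(struct demo_dev_info *info)
{
	if (!DEMO_DAX_ENABLED)
		return 0;

	info->dax_ready = 1;
	return 0;
}

/*
 * Mirrors the caller after the refactoring: it no longer declares a
 * local handle, because the helper owns allocation and stores the
 * result itself. Leaving the old declaration behind is what triggers
 * an unused-variable warning.
 */
static int demo_add_store(struct demo_dev_info *info)
{
	int rc;

	rc = demo_setup_dax(info);
	if (rc)
		return rc;

	return 0;
}
```

With the default DEMO_DAX_ENABLED of 0, demo_add_store() succeeds and
leaves dax_ready untouched, which matches the intent of the patch: the
driver keeps working as a plain block device while DAX is paused.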

Patch

diff --git a/drivers/s390/block/Kconfig b/drivers/s390/block/Kconfig
index e3710a762aba..4bfe469c04aa 100644
--- a/drivers/s390/block/Kconfig
+++ b/drivers/s390/block/Kconfig
@@ -4,13 +4,21 @@  comment "S/390 block device drivers"
 
 config DCSSBLK
 	def_tristate m
-	select FS_DAX_LIMITED
-	select DAX
 	prompt "DCSSBLK support"
 	depends on S390 && BLOCK
 	help
 	  Support for dcss block device
 
+config DCSSBLK_DAX
+	def_bool y
+	depends on DCSSBLK
+	# requires S390 ZONE_DEVICE support
+	depends on BROKEN
+	select DAX
+	prompt "DCSSBLK DAX support"
+	help
+	  Enable DAX operation for the dcss block device
+
 config DASD
 	def_tristate y
 	prompt "Support for DASD devices"
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 02a4a51da1b7..d1bc79cf56bd 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -540,6 +540,21 @@  static const struct attribute_group *dcssblk_dev_attr_groups[] = {
 	NULL,
 };
 
+static int dcssblk_setup_dax(struct dcssblk_dev_info *dev_info)
+{
+	struct dax_device *dax_dev;
+
+	if (!IS_ENABLED(CONFIG_DCSSBLK_DAX))
+		return 0;
+
+	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
+	if (IS_ERR(dax_dev))
+		return PTR_ERR(dax_dev);
+	set_dax_synchronous(dax_dev);
+	dev_info->dax_dev = dax_dev;
+	return dax_add_host(dev_info->dax_dev, dev_info->gd);
+}
+
 /*
  * device attribute for adding devices
  */
@@ -680,14 +695,7 @@  dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
 	if (rc)
 		goto put_dev;
 
-	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
-	if (IS_ERR(dax_dev)) {
-		rc = PTR_ERR(dax_dev);
-		goto put_dev;
-	}
-	set_dax_synchronous(dax_dev);
-	dev_info->dax_dev = dax_dev;
-	rc = dax_add_host(dev_info->dax_dev, dev_info->gd);
+	rc = dcssblk_setup_dax(dev_info);
 	if (rc)
 		goto out_dax;
 
@@ -923,7 +931,7 @@  __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 		*kaddr = __va(dev_info->start + offset);
 	if (pfn)
 		*pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
-				PFN_DEV|PFN_SPECIAL);
+				      PFN_DEV);
 
 	return (dev_sz - offset) / PAGE_SIZE;
 }
diff --git a/fs/Kconfig b/fs/Kconfig
index 0e4efec1d92e..a6f4f28fa09e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -60,7 +60,7 @@  endif # BLOCK
 config FS_DAX
 	bool "File system based Direct Access (DAX) support"
 	depends on MMU
-	depends on ZONE_DEVICE || FS_DAX_LIMITED
+	depends on ZONE_DEVICE
 	select FS_IOMAP
 	select DAX
 	help
@@ -96,13 +96,6 @@  config FS_DAX_PMD
 	depends on ZONE_DEVICE
 	depends on TRANSPARENT_HUGEPAGE
 
-# Selected by DAX drivers that do not expect filesystem DAX to support
-# get_user_pages() of DAX mappings. I.e. "limited" indicates no support
-# for fork() of processes with MAP_SHARED mappings or support for
-# direct-I/O to a DAX mapping.
-config FS_DAX_LIMITED
-	bool
-
 # Posix ACL utility routines
 #
 # Note: Posix ACLs can be implemented without these helpers.  Never use
diff --git a/fs/dax.c b/fs/dax.c
index becb4a6920c6..6257d3fdf8f8 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -359,9 +359,6 @@  static void dax_associate_entry(void *entry, struct address_space *mapping,
 	unsigned long size = dax_entry_size(entry), pfn, index;
 	int i = 0;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	index = linear_page_index(vma, address & ~(size - 1));
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
@@ -381,9 +378,6 @@  static void dax_disassociate_entry(void *entry, struct address_space *mapping,
 {
 	unsigned long pfn;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
 
@@ -684,12 +678,6 @@  struct page *dax_layout_busy_page_range(struct address_space *mapping,
 	pgoff_t end_idx;
 	XA_STATE(xas, &mapping->i_pages, start_idx);
 
-	/*
-	 * In the 'limited' case get_user_pages() for dax is disabled.
-	 */
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return NULL;
-
 	if (!dax_mapping(mapping) || !mapping_mapped(mapping))
 		return NULL;
 
diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
index 2d9148221e9a..eb8da94d1d19 100644
--- a/include/linux/pfn_t.h
+++ b/include/linux/pfn_t.h
@@ -9,18 +9,14 @@ 
  * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
  * PFN_DEV - pfn is not covered by system memmap by default
  * PFN_MAP - pfn has a dynamic page mapping established by a device driver
- * PFN_SPECIAL - for CONFIG_FS_DAX_LIMITED builds to allow XIP, but not
- *		 get_user_pages
  */
 #define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
 #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
 #define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
 #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
 #define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
-#define PFN_SPECIAL (1ULL << (BITS_PER_LONG_LONG - 5))
 
 #define PFN_FLAGS_TRACE \
-	{ PFN_SPECIAL,	"SPECIAL" }, \
 	{ PFN_SG_CHAIN,	"SG_CHAIN" }, \
 	{ PFN_SG_LAST,	"SG_LAST" }, \
 	{ PFN_DEV,	"DEV" }, \
@@ -117,15 +113,4 @@  pud_t pud_mkdevmap(pud_t pud);
 #endif
 #endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
 
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return (pfn.val & PFN_SPECIAL) == PFN_SPECIAL;
-}
-#else
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return false;
-}
-#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
 #endif /* _LINUX_PFN_T_H_ */
diff --git a/mm/memory.c b/mm/memory.c
index c31ea300cdf6..676f5cda992a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2462,8 +2462,6 @@  static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn, bool mkwrite)
 		return true;
 	if (pfn_t_devmap(pfn))
 		return true;
-	if (pfn_t_special(pfn))
-		return true;
 	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
 		return true;
 	return false;
diff --git a/mm/memremap.c b/mm/memremap.c
index 40d4547ce514..a6bbbe180eab 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -332,10 +332,6 @@  void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 		}
 		break;
 	case MEMORY_DEVICE_FS_DAX:
-		if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) {
-			WARN(1, "File system DAX not supported\n");
-			return ERR_PTR(-EINVAL);
-		}
 		params.pgprot = pgprot_decrypted(params.pgprot);
 		break;
 	case MEMORY_DEVICE_GENERIC: