Message ID | 172721874675.497781.3277495908107141898.stgit@dwillia2-xfh.jf.intel.com (mailing list archive) |
---|---|
State | New |
Series | dcssblk: Mark DAX broken |
Dan Williams wrote:
> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
>
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
>
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
>
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
>
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig | 12 ++++++++++--
>  drivers/s390/block/dcssblk.c | 26 +++++++++++++++++---------
>  fs/Kconfig | 9 +--------
>  fs/dax.c | 12 ------------
>  include/linux/pfn_t.h | 15 ---------------
>  mm/memory.c | 2 --
>  mm/memremap.c | 4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)

As additional motivation, with this addressed, pfn_t can also be removed for
"moar red-diff!": 44
files changed, 141 insertions(+), 301 deletions(-)

Patch below is on top of Alistair's series. It will need to be rebased on top
of the final version of that, but here it is for demonstration purposes.

-- >8 --
Subject: mm: Remove pfn_t

From: Dan Williams <dan.j.williams@intel.com>

The pfn_t type was created to convey mapping constraints from
->direct_access() methods to core mm helpers like vmf_insert_mixed(). Now
that all ->direct_access() helpers return ZONE_DEVICE pages, and
ZONE_DEVICE pages no longer require pte_devmap, there is no longer a need
for pfn_t.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/mm/pat/memtype.c | 5 +-
 drivers/dax/device.c | 19 +++---
 drivers/dax/hmem/hmem.c | 1
 drivers/dax/kmem.c | 1
 drivers/dax/pmem.c | 1
 drivers/dax/pmem/pmem.c | 1
 drivers/dax/super.c | 3 -
 drivers/gpu/drm/exynos/exynos_drm_gem.c | 1
 drivers/gpu/drm/gma500/fbdev.c | 3 -
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 1
 drivers/gpu/drm/msm/msm_gem.c | 1
 drivers/gpu/drm/omapdrm/omap_gem.c | 7 +-
 drivers/gpu/drm/v3d/v3d_bo.c | 1
 drivers/md/dm-linear.c | 4 +
 drivers/md/dm-log-writes.c | 5 +-
 drivers/md/dm-stripe.c | 4 +
 drivers/md/dm-target.c | 4 +
 drivers/md/dm-writecache.c | 16 +----
 drivers/md/dm.c | 4 +
 drivers/nvdimm/pmem.c | 15 ++---
 drivers/nvdimm/pmem.h | 6 +-
 drivers/s390/block/dcssblk.c | 21 +++----
 fs/cramfs/inode.c | 4 +
 fs/dax.c | 53 +++++++++--------
 fs/ext4/file.c | 2 -
 fs/fuse/dax.c | 3 -
 fs/fuse/virtio_fs.c | 5 +-
 fs/xfs/xfs_file.c | 2 -
 include/linux/dax.h | 12 ++--
 include/linux/device-mapper.h | 7 +-
 include/linux/huge_mm.h | 8 +--
 include/linux/mm.h | 7 +-
 include/linux/pfn.h | 13 ----
 include/linux/pfn_t.h | 96 ------------------------------
 include/linux/pgtable.h | 4 +
 include/trace/events/fs_dax.h | 14 ++--
 mm/debug_vm_pgtable.c | 1
 mm/huge_memory.c | 27 ++++----
 mm/memory.c | 38 +++++-------
 mm/memremap.c | 1
 mm/migrate.c | 1
 tools/testing/nvdimm/pmem-dax.c | 8 +--
 tools/testing/nvdimm/test/iomap.c | 11 ---
 tools/testing/nvdimm/test/nfit_test.h | 1
 44 files changed, 141 insertions(+), 301 deletions(-)
 delete mode 100644 include/linux/pfn_t.h

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index eb84593cf95c..da57ccb2da34 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -36,7 +36,6 @@
 #include <linux/debugfs.h>
 #include <linux/ioport.h>
 #include <linux/kernel.h>
-#include <linux/pfn_t.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
 #include <linux/highmem.h>
@@ -1074,7 +1073,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 	return 0;
 }
 
-void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
+void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, unsigned long pfn)
 {
 	enum page_cache_mode pcm;
 
@@ -1082,7 +1081,7 @@ void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
 		return;
 
 	/* Set prot based on lookup */
-	pcm = lookup_memtype(pfn_t_to_phys(pfn));
+	pcm = lookup_memtype(PFN_PHYS(pfn));
 	*prot = __pgprot((pgprot_val(*prot) & (~_PAGE_CACHE_MASK)) |
 			 cachemode2protval(pcm));
 }
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 4d3ddd128790..aae90a5bcd30 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -4,7 +4,6 @@
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
-#include <linux/pfn_t.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
 #include <linux/dax.h>
@@ -73,8 +72,8 @@ __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
 	return -1;
 }
 
-static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
-			      unsigned long fault_size)
+static void dax_set_mapping(struct vm_fault *vmf, unsigned long pfn,
+			    unsigned long fault_size)
 {
 	unsigned long i, nr_pages = fault_size / PAGE_SIZE;
 	struct file *filp = vmf->vma->vm_file;
@@ -89,7 +88,7 @@ static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
 			ALIGN(vmf->address, fault_size));
 
 	for (i = 0; i < nr_pages; i++) {
-		struct page
*page = pfn_to_page(pfn_t_to_pfn(pfn) + i); + struct page *page = pfn_to_page(pfn + i); page = compound_head(page); if (page->mapping) @@ -105,7 +104,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, { struct device *dev = &dev_dax->dev; phys_addr_t phys; - pfn_t pfn; + unsigned long pfn; unsigned int fault_size = PAGE_SIZE; if (check_vma(dev_dax, vmf->vma, __func__)) @@ -126,7 +125,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, 0); + pfn = PHYS_PFN(phys); dax_set_mapping(vmf, pfn, fault_size); @@ -140,7 +139,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, struct device *dev = &dev_dax->dev; phys_addr_t phys; pgoff_t pgoff; - pfn_t pfn; + unsigned long pfn; unsigned int fault_size = PMD_SIZE; if (check_vma(dev_dax, vmf->vma, __func__)) @@ -169,7 +168,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, 0); + pfn = PHYS_PFN(phys); dax_set_mapping(vmf, pfn, fault_size); @@ -184,7 +183,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, struct device *dev = &dev_dax->dev; phys_addr_t phys; pgoff_t pgoff; - pfn_t pfn; + unsigned long pfn; unsigned int fault_size = PUD_SIZE; @@ -214,7 +213,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, return VM_FAULT_SIGBUS; } - pfn = phys_to_pfn_t(phys, 0); + pfn = PHYS_PFN(phys); dax_set_mapping(vmf, pfn, fault_size); diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 5e7c53f18491..c18451a37e4f 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -2,7 +2,6 @@ #include <linux/platform_device.h> #include <linux/memregion.h> #include <linux/module.h> -#include <linux/pfn_t.h> #include <linux/dax.h> #include "../bus.h" diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index e97d47f42ee2..87b5321675ff 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -5,7 +5,6 @@ #include 
<linux/memory.h> #include <linux/module.h> #include <linux/device.h> -#include <linux/pfn_t.h> #include <linux/slab.h> #include <linux/dax.h> #include <linux/fs.h> diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c index c8ebf4e281f2..bee93066a849 100644 --- a/drivers/dax/pmem.c +++ b/drivers/dax/pmem.c @@ -2,7 +2,6 @@ /* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */ #include <linux/memremap.h> #include <linux/module.h> -#include <linux/pfn_t.h> #include "../nvdimm/pfn.h" #include "../nvdimm/nd.h" #include "bus.h" diff --git a/drivers/dax/pmem/pmem.c b/drivers/dax/pmem/pmem.c index dfe91a2990fe..ce3394617d15 100644 --- a/drivers/dax/pmem/pmem.c +++ b/drivers/dax/pmem/pmem.c @@ -3,7 +3,6 @@ #include <linux/percpu-refcount.h> #include <linux/memremap.h> #include <linux/module.h> -#include <linux/pfn_t.h> #include <linux/nd.h> #include "../bus.h" diff --git a/drivers/dax/super.c b/drivers/dax/super.c index 57a94a6c00e5..3706d803acbf 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -7,7 +7,6 @@ #include <linux/mount.h> #include <linux/pseudo_fs.h> #include <linux/magic.h> -#include <linux/pfn_t.h> #include <linux/cdev.h> #include <linux/slab.h> #include <linux/uio.h> @@ -148,7 +147,7 @@ enum dax_device_flags { * pages accessible at the device relative @pgoff. 
*/ long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages, - enum dax_access_mode mode, void **kaddr, pfn_t *pfn) + enum dax_access_mode mode, void **kaddr, unsigned long *pfn) { long avail; diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c index 638ca96830e9..ab8d6cea09f5 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c @@ -7,7 +7,6 @@ #include <linux/dma-buf.h> -#include <linux/pfn_t.h> #include <linux/shmem_fs.h> #include <linux/module.h> diff --git a/drivers/gpu/drm/gma500/fbdev.c b/drivers/gpu/drm/gma500/fbdev.c index 98b44974d42d..997c9038db38 100644 --- a/drivers/gpu/drm/gma500/fbdev.c +++ b/drivers/gpu/drm/gma500/fbdev.c @@ -6,7 +6,6 @@ **************************************************************************/ #include <linux/fb.h> -#include <linux/pfn_t.h> #include <drm/drm_crtc_helper.h> #include <drm/drm_drv.h> @@ -33,7 +32,7 @@ static vm_fault_t psb_fbdev_vm_fault(struct vm_fault *vmf) vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); for (i = 0; i < page_num; ++i) { - err = vmf_insert_mixed(vma, address, __pfn_to_pfn_t(pfn, PFN_DEV)); + err = vmf_insert_mixed(vma, address, pfn); if (unlikely(err & VM_FAULT_ERROR)) break; address += PAGE_SIZE; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index cac6d4184506..4faab805909d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -6,7 +6,6 @@ #include <linux/anon_inodes.h> #include <linux/mman.h> -#include <linux/pfn_t.h> #include <linux/sizes.h> #include <drm/drm_cache.h> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index ebc9ba66efb8..1c275008b223 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -9,7 +9,6 @@ #include <linux/spinlock.h> #include <linux/shmem_fs.h> #include <linux/dma-buf.h> -#include <linux/pfn_t.h> 
#include <drm/drm_prime.h> #include <drm/drm_file.h> diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index fdae677558f3..5523196f5b28 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -8,7 +8,6 @@ #include <linux/seq_file.h> #include <linux/shmem_fs.h> #include <linux/spinlock.h> -#include <linux/pfn_t.h> #include <linux/vmalloc.h> #include <drm/drm_prime.h> @@ -371,8 +370,7 @@ static vm_fault_t omap_gem_fault_1d(struct drm_gem_object *obj, VERB("Inserting %p pfn %lx, pa %lx", (void *)vmf->address, pfn, pfn << PAGE_SHIFT); - return vmf_insert_mixed(vma, vmf->address, - __pfn_to_pfn_t(pfn, PFN_DEV)); + return vmf_insert_mixed(vma, vmf->address, pfn); } /* Special handling for the case of faulting in 2d tiled buffers */ @@ -467,8 +465,7 @@ static vm_fault_t omap_gem_fault_2d(struct drm_gem_object *obj, pfn, pfn << PAGE_SHIFT); for (i = n; i > 0; i--) { - ret = vmf_insert_mixed(vma, - vaddr, __pfn_to_pfn_t(pfn, PFN_DEV)); + ret = vmf_insert_mixed(vma, vaddr, pfn); if (ret & VM_FAULT_ERROR) break; pfn += priv->usergart[fmt].stride_pfn; diff --git a/drivers/gpu/drm/v3d/v3d_bo.c b/drivers/gpu/drm/v3d/v3d_bo.c index a165cbcdd27b..091bc758b23a 100644 --- a/drivers/gpu/drm/v3d/v3d_bo.c +++ b/drivers/gpu/drm/v3d/v3d_bo.c @@ -20,7 +20,6 @@ */ #include <linux/dma-buf.h> -#include <linux/pfn_t.h> #include <linux/vmalloc.h> #include "v3d_drv.h" diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c index 49fb0f684193..211528d1eebf 100644 --- a/drivers/md/dm-linear.c +++ b/drivers/md/dm-linear.c @@ -167,8 +167,8 @@ static struct dax_device *linear_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff) } static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct dax_device *dax_dev = linear_dax_pgoff(ti, &pgoff); diff --git 
a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c index 8d7df8303d0a..63037f0cd277 100644 --- a/drivers/md/dm-log-writes.c +++ b/drivers/md/dm-log-writes.c @@ -890,8 +890,9 @@ static struct dax_device *log_writes_dax_pgoff(struct dm_target *ti, } static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + long nr_pages, + enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct dax_device *dax_dev = log_writes_dax_pgoff(ti, &pgoff); diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c index 4112071de0be..b13c43d716f1 100644 --- a/drivers/md/dm-stripe.c +++ b/drivers/md/dm-stripe.c @@ -315,8 +315,8 @@ static struct dax_device *stripe_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff) } static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct dax_device *dax_dev = stripe_dax_pgoff(ti, &pgoff); diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c index 652627aea11b..6dfb6d680f2c 100644 --- a/drivers/md/dm-target.c +++ b/drivers/md/dm-target.c @@ -254,8 +254,8 @@ static void io_err_io_hints(struct dm_target *ti, struct queue_limits *limits) } static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { return -EIO; } diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 7ce8847b3404..2c841e30ae92 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -13,7 +13,6 @@ #include <linux/dm-io.h> #include <linux/dm-kcopyd.h> #include <linux/dax.h> -#include <linux/pfn_t.h> #include <linux/libnvdimm.h> #include <linux/delay.h> #include "dm-io-tracker.h" @@ -256,7 +255,7 
@@ static int persistent_memory_claim(struct dm_writecache *wc) int r; loff_t s; long p, da; - pfn_t pfn; + unsigned long pfn; int id; struct page **pages; sector_t offset; @@ -290,11 +289,6 @@ static int persistent_memory_claim(struct dm_writecache *wc) r = da; goto err2; } - if (!pfn_t_has_page(pfn)) { - wc->memory_map = NULL; - r = -EOPNOTSUPP; - goto err2; - } if (da != p) { long i; @@ -314,13 +308,9 @@ static int persistent_memory_claim(struct dm_writecache *wc) r = daa ? daa : -EINVAL; goto err3; } - if (!pfn_t_has_page(pfn)) { - r = -EOPNOTSUPP; - goto err3; - } while (daa-- && i < p) { - pages[i++] = pfn_t_to_page(pfn); - pfn.val++; + pages[i++] = pfn_to_page(pfn); + pfn++; if (!(i & 15)) cond_resched(); } diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 87bb90303435..d24324c49433 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1231,8 +1231,8 @@ static struct dm_target *dm_dax_get_live_target(struct mapped_device *md, } static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct mapped_device *md = dax_get_private(dax_dev); sector_t sector = pgoff * PAGE_SECTORS; diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index 451cd0fa0c94..d3b3febc8124 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -20,7 +20,6 @@ #include <linux/kstrtox.h> #include <linux/vmalloc.h> #include <linux/blk-mq.h> -#include <linux/pfn_t.h> #include <linux/slab.h> #include <linux/uio.h> #include <linux/dax.h> @@ -242,7 +241,7 @@ static void pmem_submit_bio(struct bio *bio) /* see "strong" declaration in tools/testing/nvdimm/pmem-dax.c */ __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff, long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) + unsigned long *pfn) { resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset; sector_t 
sector = PFN_PHYS(pgoff) >> SECTOR_SHIFT; @@ -254,7 +253,7 @@ __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff, if (kaddr) *kaddr = pmem->virt_addr + offset; if (pfn) - *pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags); + *pfn = PHYS_PFN(pmem->phys_addr + offset); if (bb->count && badblocks_check(bb, sector, num, &first_bad, &num_bad)) { @@ -301,9 +300,9 @@ static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff, PAGE_SIZE)); } -static long pmem_dax_direct_access(struct dax_device *dax_dev, - pgoff_t pgoff, long nr_pages, enum dax_access_mode mode, - void **kaddr, pfn_t *pfn) +static long pmem_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct pmem_device *pmem = dax_get_private(dax_dev); @@ -432,7 +431,8 @@ static void pmem_release_disk(void *__pmem) } static int pmem_pagemap_memory_failure(struct dev_pagemap *pgmap, - unsigned long pfn, unsigned long nr_pages, int mf_flags) + unsigned long pfn, + unsigned long nr_pages, int mf_flags) { struct pmem_device *pmem = container_of(pgmap, struct pmem_device, pgmap); @@ -513,7 +513,6 @@ static int pmem_attach_disk(struct device *dev, pmem->disk = disk; pmem->pgmap.owner = pmem; - pmem->pfn_flags = 0; if (is_nd_pfn(dev)) { pmem->pgmap.type = MEMORY_DEVICE_FS_DAX; pmem->pgmap.ops = &fsdax_pagemap_ops; diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h index 392b0b38acb9..99ce3ac51fdd 100644 --- a/drivers/nvdimm/pmem.h +++ b/drivers/nvdimm/pmem.h @@ -5,7 +5,6 @@ #include <linux/badblocks.h> #include <linux/memremap.h> #include <linux/types.h> -#include <linux/pfn_t.h> #include <linux/fs.h> enum dax_access_mode; @@ -16,7 +15,6 @@ struct pmem_device { phys_addr_t phys_addr; /* when non-zero this device is hosting a 'pfn' instance */ phys_addr_t data_offset; - u64 pfn_flags; void *virt_addr; /* immutable base size of the namespace */ size_t size; @@ -30,8 +28,8 
@@ struct pmem_device { }; long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn); + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn); #ifdef CONFIG_MEMORY_FAILURE static inline bool test_and_clear_pmem_poison(struct page *page) diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index d1bc79cf56bd..9b537020fe25 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -17,7 +17,6 @@ #include <linux/blkdev.h> #include <linux/completion.h> #include <linux/interrupt.h> -#include <linux/pfn_t.h> #include <linux/uio.h> #include <linux/dax.h> #include <linux/io.h> @@ -32,8 +31,8 @@ static int dcssblk_open(struct gendisk *disk, blk_mode_t mode); static void dcssblk_release(struct gendisk *disk); static void dcssblk_submit_bio(struct bio *bio); static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn); + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn); static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0"; @@ -919,9 +918,9 @@ dcssblk_submit_bio(struct bio *bio) bio_io_error(bio); } -static long -__dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff, - long nr_pages, void **kaddr, pfn_t *pfn) +static long __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, + pgoff_t pgoff, long nr_pages, void **kaddr, + unsigned long *pfn) { resource_size_t offset = pgoff * PAGE_SIZE; unsigned long dev_sz; @@ -930,16 +929,14 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff, if (kaddr) *kaddr = __va(dev_info->start + offset); if (pfn) - *pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset), - PFN_DEV); + *pfn = PFN_DOWN(dev_info->start + offset); return (dev_sz - offset) / PAGE_SIZE; } -static long -dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t 
pgoff, - long nr_pages, enum dax_access_mode mode, void **kaddr, - pfn_t *pfn) +static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, + long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) { struct dcssblk_dev_info *dev_info = dax_get_private(dax_dev); diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index b84d1747a020..ba7f7ca2aebc 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -17,7 +17,6 @@ #include <linux/fs.h> #include <linux/file.h> #include <linux/pagemap.h> -#include <linux/pfn_t.h> #include <linux/ramfs.h> #include <linux/init.h> #include <linux/string.h> @@ -412,7 +411,8 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma) for (i = 0; i < pages && !ret; i++) { vm_fault_t vmf; unsigned long off = i * PAGE_SIZE; - pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV); + unsigned long pfn = PHYS_PFN(address + off); + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn); if (vmf & VM_FAULT_ERROR) ret = vm_fault_to_errno(vmf, 0); diff --git a/fs/dax.c b/fs/dax.c index 72d6d4586330..fcbe62bde685 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -20,7 +20,6 @@ #include <linux/sched/signal.h> #include <linux/uio.h> #include <linux/vmstat.h> -#include <linux/pfn_t.h> #include <linux/sizes.h> #include <linux/mmu_notifier.h> #include <linux/iomap.h> @@ -76,9 +75,9 @@ static struct folio *dax_to_folio(void *entry) return page_folio(pfn_to_page(dax_to_pfn(entry))); } -static void *dax_make_entry(pfn_t pfn, unsigned long flags) +static void *dax_make_entry(unsigned long pfn, unsigned long flags) { - return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT)); + return xa_mk_value(flags | (pfn << DAX_SHIFT)); } static bool dax_is_locked(void *entry) @@ -612,7 +611,7 @@ static void *grab_mapping_entry(struct xa_state *xas, if (order > 0) flags |= DAX_PMD; - entry = dax_make_entry(pfn_to_pfn_t(0), flags); + entry = dax_make_entry(0, flags); dax_lock_entry(xas, entry); if (xas_error(xas)) goto 
out_unlock; @@ -837,7 +836,7 @@ static bool dax_fault_is_synchronous(const struct iomap_iter *iter, * appropriate. */ static void *dax_insert_entry(struct xa_state *xas, struct vm_fault *vmf, - const struct iomap_iter *iter, void *entry, pfn_t pfn, + const struct iomap_iter *iter, void *entry, unsigned long pfn, unsigned long flags) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; @@ -1036,7 +1035,8 @@ int dax_writeback_mapping_range(struct address_space *mapping, EXPORT_SYMBOL_GPL(dax_writeback_mapping_range); static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos, - size_t size, void **kaddr, pfn_t *pfnp) + size_t size, void **kaddr, + unsigned long *pfnp) { pgoff_t pgoff = dax_iomap_pgoff(iomap, pos); int id, rc = 0; @@ -1054,7 +1054,7 @@ static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos, rc = -EINVAL; if (PFN_PHYS(length) < size) goto out; - if (pfn_t_to_pfn(*pfnp) & (PHYS_PFN(size)-1)) + if (*pfnp & (PHYS_PFN(size)-1)) goto out; rc = 0; @@ -1158,8 +1158,8 @@ static vm_fault_t dax_load_hole(struct xa_state *xas, struct vm_fault *vmf, { struct inode *inode = iter->inode; unsigned long vaddr = vmf->address; - pfn_t pfn = pfn_to_pfn_t(my_zero_pfn(vaddr)); - struct page *page = pfn_t_to_page(pfn); + unsigned long pfn = my_zero_pfn(vaddr); + struct page *page = pfn_to_page(pfn); vm_fault_t ret; *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_ZERO_PAGE); @@ -1183,7 +1183,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, struct folio *zero_folio; spinlock_t *ptl; pmd_t pmd_entry; - pfn_t pfn; + unsigned long pfn; if (arch_needs_pgtable_deposit()) { pgtable = pte_alloc_one(vma->vm_mm); @@ -1195,7 +1195,7 @@ static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf, if (unlikely(!zero_folio)) goto fallback; - pfn = page_to_pfn_t(&zero_folio->page); + pfn = page_to_pfn(&zero_folio->page); *entry = dax_insert_entry(xas, vmf, iter, *entry, pfn, DAX_PMD 
| DAX_ZERO_PAGE); @@ -1564,7 +1564,7 @@ static vm_fault_t dax_fault_return(int error) * insertion for now and return the pfn so that caller can insert it after the * fsync is done. */ -static vm_fault_t dax_fault_synchronous_pfnp(pfn_t *pfnp, pfn_t pfn) +static vm_fault_t dax_fault_synchronous_pfnp(unsigned long *pfnp, unsigned long pfn) { if (WARN_ON_ONCE(!pfnp)) return VM_FAULT_SIGBUS; @@ -1612,8 +1612,9 @@ static vm_fault_t dax_fault_cow_page(struct vm_fault *vmf, * @pmd: distinguish whether it is a pmd fault */ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, - const struct iomap_iter *iter, pfn_t *pfnp, - struct xa_state *xas, void **entry, bool pmd) + const struct iomap_iter *iter, + unsigned long *pfnp, struct xa_state *xas, + void **entry, bool pmd) { const struct iomap *iomap = &iter->iomap; const struct iomap *srcmap = iomap_iter_srcmap(iter); @@ -1622,7 +1623,7 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, bool write = iter->flags & IOMAP_WRITE; unsigned long entry_flags = pmd ? 
DAX_PMD : 0; int ret, err = 0; - pfn_t pfn; + unsigned long pfn; void *kaddr; struct page *page; @@ -1657,7 +1658,7 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, if (dax_fault_is_synchronous(iter, vmf->vma)) return dax_fault_synchronous_pfnp(pfnp, pfn); - page = pfn_t_to_page(pfn); + page = pfn_to_page(pfn); page_ref_inc(page); if (pmd) @@ -1674,8 +1675,9 @@ static vm_fault_t dax_fault_iter(struct vm_fault *vmf, return ret; } -static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, pfn_t *pfnp, - int *iomap_errp, const struct iomap_ops *ops) +static vm_fault_t dax_iomap_pte_fault(struct vm_fault *vmf, unsigned long *pfnp, + int *iomap_errp, + const struct iomap_ops *ops) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE(xas, &mapping->i_pages, vmf->pgoff); @@ -1784,7 +1786,7 @@ static bool dax_fault_check_fallback(struct vm_fault *vmf, struct xa_state *xas, return false; } -static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, +static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, unsigned long *pfnp, const struct iomap_ops *ops) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; @@ -1863,8 +1865,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, return ret; } #else -static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, - const struct iomap_ops *ops) +static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, unsigned long *pfnp, + const struct iomap_ops *ops) { return VM_FAULT_FALLBACK; } @@ -1884,7 +1886,8 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, * successfully. 
*/ vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order, - pfn_t *pfnp, int *iomap_errp, const struct iomap_ops *ops) + unsigned long *pfnp, int *iomap_errp, + const struct iomap_ops *ops) { if (order == 0) return dax_iomap_pte_fault(vmf, pfnp, iomap_errp, ops); @@ -1905,7 +1908,7 @@ EXPORT_SYMBOL_GPL(dax_iomap_fault); * for an mmaped DAX file. It also marks the page cache entry as dirty. */ static vm_fault_t -dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) +dax_insert_pfn_mkwrite(struct vm_fault *vmf, unsigned long pfn, unsigned int order) { struct address_space *mapping = vmf->vma->vm_file->f_mapping; XA_STATE_ORDER(xas, &mapping->i_pages, vmf->pgoff, order); @@ -1927,7 +1930,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) xas_set_mark(&xas, PAGECACHE_TAG_DIRTY); dax_lock_entry(&xas, entry); xas_unlock_irq(&xas); - page = pfn_t_to_page(pfn); + page = pfn_to_page(pfn); page_ref_inc(page); if (order == 0) ret = dax_insert_pfn(vmf, pfn, true); @@ -1954,7 +1957,7 @@ dax_insert_pfn_mkwrite(struct vm_fault *vmf, pfn_t pfn, unsigned int order) * table entry. 
  */
 vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
-		pfn_t pfn)
+		unsigned long pfn)
 {
 	int err;
 	loff_t start = ((loff_t)vmf->pgoff) << PAGE_SHIFT;
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c89e434db6b7..13e939bcc7ac 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -722,7 +722,7 @@ static vm_fault_t ext4_dax_huge_fault(struct vm_fault *vmf, unsigned int order)
 	bool write = (vmf->flags & FAULT_FLAG_WRITE) &&
 		(vmf->vma->vm_flags & VM_SHARED);
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
-	pfn_t pfn;
+	unsigned long pfn;

 	if (write) {
 		sb_start_pagefault(sb);
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index da505956208f..0b6b440520da 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -10,7 +10,6 @@
 #include <linux/dax.h>
 #include <linux/uio.h>
 #include <linux/pagemap.h>
-#include <linux/pfn_t.h>
 #include <linux/iomap.h>
 #include <linux/interval_tree.h>

@@ -788,7 +787,7 @@ static vm_fault_t __fuse_dax_fault(struct vm_fault *vmf, unsigned int order,
 	vm_fault_t ret;
 	struct inode *inode = file_inode(vmf->vma->vm_file);
 	struct super_block *sb = inode->i_sb;
-	pfn_t pfn;
+	unsigned long pfn;
 	int error = 0;
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_conn_dax *fcd = fc->dax;
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index f79a94d148da..e49e2ae33206 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -9,7 +9,6 @@
 #include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/group_cpus.h>
-#include <linux/pfn_t.h>
 #include <linux/memremap.h>
 #include <linux/module.h>
 #include <linux/virtio.h>
@@ -866,7 +865,7 @@ static void virtio_fs_cleanup_vqs(struct virtio_device *vdev)
  */
 static long virtio_fs_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 				    long nr_pages, enum dax_access_mode mode,
-				    void **kaddr, pfn_t *pfn)
+				    void **kaddr, unsigned long *pfn)
 {
 	struct virtio_fs *fs = dax_get_private(dax_dev);
 	phys_addr_t offset = PFN_PHYS(pgoff);
@@ -875,7 +874,7 @@ static long virtio_fs_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = fs->window_kaddr + offset;
 	if (pfn)
-		*pfn = phys_to_pfn_t(fs->window_phys_addr + offset, 0);
+		*pfn = PHYS_PFN(fs->window_phys_addr + offset);
 	return nr_pages > max_nr_pages ? max_nr_pages : nr_pages;
 }

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4cdc54dc9686..47edb2785ad2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1243,7 +1243,7 @@ xfs_dax_fault_locked(
 	bool			write_fault)
 {
 	vm_fault_t		ret;
-	pfn_t			pfn;
+	unsigned long		pfn;

 	if (!IS_ENABLED(CONFIG_FS_DAX)) {
 		ASSERT(0);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 0f6f355ec3b5..153dd2398178 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -26,7 +26,7 @@ struct dax_operations {
 	 * number of pages available for DAX at that pfn.
 	 */
 	long (*direct_access)(struct dax_device *, pgoff_t, long,
-			enum dax_access_mode, void **, pfn_t *);
+			enum dax_access_mode, void **, unsigned long *);
 	/*
 	 * Validate whether this device is usable as an fsdax backing
 	 * device.
@@ -241,7 +241,8 @@ static inline void dax_read_unlock(int id)
 bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		enum dax_access_mode mode, void **kaddr, pfn_t *pfn);
+		enum dax_access_mode mode, void **kaddr,
+		unsigned long *pfn);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
@@ -255,9 +256,10 @@ void dax_flush(struct dax_device *dax_dev, void *addr, size_t size);
 ssize_t dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops);
 vm_fault_t dax_iomap_fault(struct vm_fault *vmf, unsigned int order,
-		    pfn_t *pfnp, int *errp, const struct iomap_ops *ops);
-vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf,
-		unsigned int order, pfn_t pfn);
+			   unsigned long *pfnp, int *errp,
+			   const struct iomap_ops *ops);
+vm_fault_t dax_finish_sync_fault(struct vm_fault *vmf, unsigned int order,
+				 unsigned long pfn);
 int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index);
 int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
 				      pgoff_t index);
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 53ca3a913d06..05fadca5b588 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -147,9 +147,10 @@ typedef int (*dm_busy_fn) (struct dm_target *ti);
  * < 0 : error
  * >= 0 : the number of bytes accessible at the address
  */
-typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode node, void **kaddr,
-		pfn_t *pfn);
+typedef long (*dm_dax_direct_access_fn)(struct dm_target *ti, pgoff_t pgoff,
+					long nr_pages,
+					enum dax_access_mode node, void **kaddr,
+					unsigned long *pfn);
 typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
 		size_t nr_pages);

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 79a24ac31080..a047379d94ad 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -38,10 +38,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		    pmd_t *pmd, unsigned long addr, pgprot_t newprot,
 		    unsigned long cp_flags);

-vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write);
-vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write);
+vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write);
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write);

 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_UNSUPPORTED,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d9517e109ac3..41a419c549ef 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3437,14 +3437,15 @@ int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
 int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
-vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write);
+vm_fault_t dax_insert_pfn(struct vm_fault *vmf, unsigned long pfn, bool write);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn, pgprot_t pgprot);
 vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn);
-int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
+			    unsigned long pfn);
+int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start,
+		    unsigned long len);

 static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma,
 				unsigned long addr, struct page *page)
diff --git a/include/linux/pfn.h b/include/linux/pfn.h
index 14bc053c53d8..482cf9a07fda 100644
--- a/include/linux/pfn.h
+++ b/include/linux/pfn.h
@@ -2,19 +2,6 @@
 #ifndef _LINUX_PFN_H_
 #define _LINUX_PFN_H_

-#ifndef __ASSEMBLY__
-#include <linux/types.h>
-
-/*
- * pfn_t: encapsulates a page-frame number that is optionally backed
- * by memmap (struct page). Whether a pfn_t has a 'struct page'
- * backing is indicated by flags in the high bits of the value.
- */
-typedef struct {
-	u64 val;
-} pfn_t;
-#endif
-
 #define PFN_ALIGN(x)	(((unsigned long)(x) + (PAGE_SIZE - 1)) & PAGE_MASK)
 #define PFN_UP(x)	(((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
 #define PFN_DOWN(x)	((x) >> PAGE_SHIFT)
diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
deleted file mode 100644
index 76e519b20553..000000000000
--- a/include/linux/pfn_t.h
+++ /dev/null
@@ -1,96 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_PFN_T_H_
-#define _LINUX_PFN_T_H_
-#include <linux/mm.h>
-
-/*
- * PFN_FLAGS_MASK - mask of all the possible valid pfn_t flags
- * PFN_SG_CHAIN - pfn is a pointer to the next scatterlist entry
- * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
- * PFN_DEV - pfn is not covered by system memmap by default
- * PFN_MAP - pfn has a dynamic page mapping established by a device driver
- */
-#define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
-#define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
-#define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
-#define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
-#define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
-
-#define PFN_FLAGS_TRACE \
-	{ PFN_SG_CHAIN,	"SG_CHAIN" }, \
-	{ PFN_SG_LAST,	"SG_LAST" }, \
-	{ PFN_DEV,	"DEV" }, \
-	{ PFN_MAP,	"MAP" }
-
-static inline pfn_t __pfn_to_pfn_t(unsigned long pfn, u64 flags)
-{
-	pfn_t pfn_t = { .val = pfn | (flags & PFN_FLAGS_MASK), };
-
-	return pfn_t;
-}
-
-/* a default pfn to pfn_t conversion assumes that @pfn is pfn_valid() */
-static inline pfn_t pfn_to_pfn_t(unsigned long pfn)
-{
-	return __pfn_to_pfn_t(pfn, 0);
-}
-
-static inline pfn_t phys_to_pfn_t(phys_addr_t addr, u64 flags)
-{
-	return __pfn_to_pfn_t(addr >> PAGE_SHIFT, flags);
-}
-
-static inline bool pfn_t_has_page(pfn_t pfn)
-{
-	return (pfn.val & PFN_MAP) == PFN_MAP || (pfn.val & PFN_DEV) == 0;
-}
-
-static inline unsigned long pfn_t_to_pfn(pfn_t pfn)
-{
-	return pfn.val & ~PFN_FLAGS_MASK;
-}
-
-static inline struct page *pfn_t_to_page(pfn_t pfn)
-{
-	if (pfn_t_has_page(pfn))
-		return pfn_to_page(pfn_t_to_pfn(pfn));
-	return NULL;
-}
-
-static inline phys_addr_t pfn_t_to_phys(pfn_t pfn)
-{
-	return PFN_PHYS(pfn_t_to_pfn(pfn));
-}
-
-static inline pfn_t page_to_pfn_t(struct page *page)
-{
-	return pfn_to_pfn_t(page_to_pfn(page));
-}
-
-static inline int pfn_t_valid(pfn_t pfn)
-{
-	return pfn_valid(pfn_t_to_pfn(pfn));
-}
-
-#ifdef CONFIG_MMU
-static inline pte_t pfn_t_pte(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pte(pfn_t_to_pfn(pfn), pgprot);
-}
-#endif
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static inline pmd_t pfn_t_pmd(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pmd(pfn_t_to_pfn(pfn), pgprot);
-}
-
-#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
-static inline pud_t pfn_t_pud(pfn_t pfn, pgprot_t pgprot)
-{
-	return pfn_pud(pfn_t_to_pfn(pfn), pgprot);
-}
-#endif
-#endif
-
-#endif /* _LINUX_PFN_T_H_ */
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index f3a95e38872c..d51e87e1adae 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1513,7 +1513,7 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
  * by vmf_insert_pfn().
  */
 static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-				    pfn_t pfn)
+				    unsigned long pfn)
 {
 }

@@ -1549,7 +1549,7 @@ extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long pfn, unsigned long addr,
 			   unsigned long size);
 extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-			     pfn_t pfn);
+			     unsigned long pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size, bool mm_wr_locked);
diff --git a/include/trace/events/fs_dax.h b/include/trace/events/fs_dax.h
index 86fe6aecff1e..10f706e37040 100644
--- a/include/trace/events/fs_dax.h
+++ b/include/trace/events/fs_dax.h
@@ -104,14 +104,14 @@ DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole_fallback);

 DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
-		long length, pfn_t pfn, void *radix_entry),
+		long length, unsigned long pfn, void *radix_entry),
 	TP_ARGS(inode, vmf, length, pfn, radix_entry),
 	TP_STRUCT__entry(
 		__field(unsigned long, ino)
 		__field(unsigned long, vm_flags)
 		__field(unsigned long, address)
 		__field(long, length)
-		__field(u64, pfn_val)
+		__field(unsigned long, pfn)
 		__field(void *, radix_entry)
 		__field(dev_t, dev)
 		__field(int, write)
@@ -123,11 +123,11 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 		__entry->address = vmf->address;
 		__entry->write = vmf->flags & FAULT_FLAG_WRITE;
 		__entry->length = length;
-		__entry->pfn_val = pfn.val;
+		__entry->pfn = pfn;
 		__entry->radix_entry = radix_entry;
 	),
 	TP_printk("dev %d:%d ino %#lx %s %s address %#lx length %#lx "
-			"pfn %#llx %s radix_entry %#lx",
+			"pfn %#lx radix_entry %#lx",
 		MAJOR(__entry->dev),
 		MINOR(__entry->dev),
 		__entry->ino,
@@ -135,9 +135,7 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 		__entry->write ? "write" : "read",
 		__entry->address,
 		__entry->length,
-		__entry->pfn_val & ~PFN_FLAGS_MASK,
-		__print_flags_u64(__entry->pfn_val & PFN_FLAGS_MASK, "|",
-			PFN_FLAGS_TRACE),
+		__entry->pfn,
 		(unsigned long)__entry->radix_entry
 	)
 )
@@ -145,7 +143,7 @@ DECLARE_EVENT_CLASS(dax_pmd_insert_mapping_class,
 #define DEFINE_PMD_INSERT_MAPPING_EVENT(name) \
 DEFINE_EVENT(dax_pmd_insert_mapping_class, name, \
 	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
-		long length, pfn_t pfn, void *radix_entry), \
+		long length, unsigned long pfn, void *radix_entry), \
 	TP_ARGS(inode, vmf, length, pfn, radix_entry))

 DEFINE_PMD_INSERT_MAPPING_EVENT(dax_pmd_insert_mapping);
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 1262148d97b7..ec8e8d746658 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -20,7 +20,6 @@
 #include <linux/mman.h>
 #include <linux/mm_types.h>
 #include <linux/module.h>
-#include <linux/pfn_t.h>
 #include <linux/printk.h>
 #include <linux/pgtable.h>
 #include <linux/random.h>
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7c39950bfdae..ea65c2db2bb1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -23,7 +23,6 @@
 #include <linux/mm_types.h>
 #include <linux/khugepaged.h>
 #include <linux/freezer.h>
-#include <linux/pfn_t.h>
 #include <linux/mman.h>
 #include <linux/memremap.h>
 #include <linux/pagemap.h>
@@ -1232,15 +1231,15 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 }

 static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, pfn_t pfn, pgprot_t prot, bool write,
-		pgtable_t pgtable)
+		pmd_t *pmd, unsigned long pfn, pgprot_t prot,
+		bool write, pgtable_t pgtable)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t entry;

 	if (!pmd_none(*pmd)) {
 		if (write) {
-			if (pmd_pfn(*pmd) != pfn_t_to_pfn(pfn)) {
+			if (pmd_pfn(*pmd) != pfn) {
 				WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
 				return;
 			}
@@ -1253,7 +1252,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 		return;
 	}

-	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
+	entry = pmd_mkhuge(pfn_pmd(pfn, prot));
 	if (write) {
 		entry = pmd_mkyoung(pmd_mkdirty(entry));
 		entry = maybe_pmd_mkwrite(entry, vma);
@@ -1279,7 +1278,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	unsigned long addr = vmf->address & PMD_MASK;
 	struct vm_area_struct *vma = vmf->vma;
@@ -1316,7 +1315,7 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
 }
 EXPORT_SYMBOL_GPL(vmf_insert_pfn_pmd);

-vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address & PMD_MASK;
@@ -1339,7 +1338,7 @@ vm_fault_t dax_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)

 	ptl = pmd_lock(mm, vmf->pmd);
 	if (pmd_none(*vmf->pmd)) {
-		page = pfn_t_to_page(pfn);
+		page = pfn_to_page(pfn);
 		folio = page_folio(page);
 		folio_get(folio);
 		folio_add_file_rmap_pmd(folio, page, vma);
@@ -1364,7 +1363,7 @@ static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
 }

 static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
-		pud_t *pud, pfn_t pfn, bool write)
+		pud_t *pud, unsigned long pfn, bool write)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pgprot_t prot = vma->vm_page_prot;
@@ -1372,7 +1371,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,

 	if (!pud_none(*pud)) {
 		if (write) {
-			if (pud_pfn(*pud) != pfn_t_to_pfn(pfn)) {
+			if (pud_pfn(*pud) != pfn) {
 				WARN_ON_ONCE(!is_huge_zero_pud(*pud));
 				return;
 			}
@@ -1384,7 +1383,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 		return;
 	}

-	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
+	entry = pud_mkhuge(pfn_pud(pfn, prot));
 	if (write) {
 		entry = pud_mkyoung(pud_mkdirty(entry));
 		entry = maybe_pud_mkwrite(entry, vma);
@@ -1403,7 +1402,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	unsigned long addr = vmf->address & PUD_MASK;
 	struct vm_area_struct *vma = vmf->vma;
@@ -1440,7 +1439,7 @@ EXPORT_SYMBOL_GPL(vmf_insert_pfn_pud);
  *
  * Return: vm_fault_t value.
  */
-vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
+vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address & PUD_MASK;
@@ -1458,7 +1457,7 @@ vm_fault_t dax_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)

 	ptl = pud_lock(mm, pud);
 	if (pud_none(*vmf->pud)) {
-		page = pfn_t_to_page(pfn);
+		page = pfn_to_page(pfn);
 		folio = page_folio(page);
 		folio_get(folio);
 		folio_add_file_rmap_pud(folio, page, vma);
diff --git a/mm/memory.c b/mm/memory.c
index 721aac02a636..ed75f561d445 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -59,7 +59,6 @@
 #include <linux/export.h>
 #include <linux/delayacct.h>
 #include <linux/init.h>
-#include <linux/pfn_t.h>
 #include <linux/writeback.h>
 #include <linux/memcontrol.h>
 #include <linux/mmu_notifier.h>
@@ -2327,7 +2326,7 @@ int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
 EXPORT_SYMBOL(vm_map_pages_zero);

 static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
-			pfn_t pfn, pgprot_t prot)
+			unsigned long pfn, pgprot_t prot)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte, entry;
@@ -2341,7 +2340,7 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		goto out_unlock;

 	/* Ok, finally just insert the thing.. */
-	entry = pte_mkspecial(pfn_t_pte(pfn, prot));
+	entry = pte_mkspecial(pfn_pte(pfn, prot));
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */

@@ -2385,7 +2384,7 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
  * Return: vm_fault_t value.
  */
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn, pgprot_t pgprot)
+			       unsigned long pfn, pgprot_t pgprot)
 {
 	/*
 	 * Technically, architectures with pte_special can avoid all these
@@ -2405,9 +2404,9 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;

-	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
+	track_pfn_insert(vma, &pgprot, pfn);

-	return insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
+	return insert_pfn(vma, addr, pfn, pgprot);
 }
 EXPORT_SYMBOL(vmf_insert_pfn_prot);

@@ -2438,21 +2437,20 @@ vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vmf_insert_pfn);

-static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
+static bool vm_mixed_ok(struct vm_area_struct *vma, unsigned long pfn)
 {
-	if (unlikely(is_zero_pfn(pfn_t_to_pfn(pfn))) &&
-	    !vm_mixed_zeropage_allowed(vma))
+	if (unlikely(is_zero_pfn(pfn)) && !vm_mixed_zeropage_allowed(vma))
 		return false;
 	/* these checks mirror the abort conditions in vm_normal_page */
 	if (vma->vm_flags & VM_MIXEDMAP)
 		return true;
-	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
+	if (is_zero_pfn(pfn))
 		return true;
 	return false;
 }

 vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-		pfn_t pfn)
+		unsigned long pfn)
 {
 	pgprot_t pgprot = vma->vm_page_prot;
 	int err;
@@ -2465,7 +2463,7 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,

 	track_pfn_insert(vma, &pgprot, pfn);

-	if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot))
+	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;

 	/*
@@ -2475,15 +2473,10 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 	 * than insert_pfn). If a zero_pfn were inserted into a VM_MIXEDMAP
 	 * without pte special, it would there be refcounted as a normal page.
 	 */
-	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pfn_t_valid(pfn)) {
+	if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) && pfn_valid(pfn)) {
 		struct page *page;

-		/*
-		 * At this point we are committed to insert_page()
-		 * regardless of whether the caller specified flags that
-		 * result in pfn_t_has_page() == false.
-		 */
-		page = pfn_to_page(pfn_t_to_pfn(pfn));
+		page = pfn_to_page(pfn);
 		err = insert_page(vma, addr, page, pgprot, false);
 	} else {
 		return insert_pfn(vma, addr, pfn, pgprot);
@@ -2498,11 +2491,10 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 }
 EXPORT_SYMBOL(vmf_insert_mixed);

-vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
+vm_fault_t dax_insert_pfn(struct vm_fault *vmf, unsigned long pfn, bool write)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	pgprot_t pgprot = vma->vm_page_prot;
-	unsigned long pfn = pfn_t_to_pfn(pfn_t);
 	struct page *page = pfn_to_page(pfn);
 	unsigned long addr = vmf->address;
 	int err;
@@ -2510,7 +2502,7 @@ vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;

-	track_pfn_insert(vma, &pgprot, pfn_t);
+	track_pfn_insert(vma, &pgprot, pfn);

 	if (!pfn_modify_allowed(pfn, pgprot))
 		return VM_FAULT_SIGBUS;
@@ -2518,7 +2510,7 @@ vm_fault_t dax_insert_pfn(struct vm_fault *vmf, pfn_t pfn_t, bool write)
 	/*
 	 * We refcount the page normally so make sure pfn_valid is true.
 	 */
-	if (!pfn_t_valid(pfn_t))
+	if (!pfn_valid(pfn))
 		return VM_FAULT_SIGBUS;

 	if (WARN_ON(is_zero_pfn(pfn) && write))
diff --git a/mm/memremap.c b/mm/memremap.c
index 30bb99301b18..2b92195638db 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -5,7 +5,6 @@
 #include <linux/kasan.h>
 #include <linux/memory_hotplug.h>
 #include <linux/memremap.h>
-#include <linux/pfn_t.h>
 #include <linux/swap.h>
 #include <linux/mm.h>
 #include <linux/mmzone.h>
diff --git a/mm/migrate.c b/mm/migrate.c
index ba4893d42618..18d19ef24311 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -37,7 +37,6 @@
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
-#include <linux/pfn_t.h>
 #include <linux/memremap.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/balloon_compaction.h>
diff --git a/tools/testing/nvdimm/pmem-dax.c b/tools/testing/nvdimm/pmem-dax.c
index c1ec099a3b1d..f5ef3d034db5 100644
--- a/tools/testing/nvdimm/pmem-dax.c
+++ b/tools/testing/nvdimm/pmem-dax.c
@@ -9,8 +9,8 @@
 #include <nd.h>

 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, enum dax_access_mode mode, void **kaddr,
-		pfn_t *pfn)
+		long nr_pages, enum dax_access_mode mode,
+		void **kaddr, unsigned long *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;

@@ -29,7 +29,7 @@ long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 			*kaddr = pmem->virt_addr + offset;
 		page = vmalloc_to_page(pmem->virt_addr + offset);
 		if (pfn)
-			*pfn = page_to_pfn_t(page);
+			*pfn = page_to_pfn(page);
 		pr_debug_ratelimited("%s: pmem: %p pgoff: %#lx pfn: %#lx\n",
 				__func__, pmem, pgoff, page_to_pfn(page));

@@ -39,7 +39,7 @@ long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
 	if (kaddr)
 		*kaddr = pmem->virt_addr + offset;
 	if (pfn)
-		*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
+		*pfn = PHYS_PFN(pmem->phys_addr + offset);

 	/*
 	 * If badblocks are present, limit known good range to the
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index e4313726fae3..f7e7bfe9bb85 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -8,7 +8,6 @@
 #include <linux/ioport.h>
 #include <linux/module.h>
 #include <linux/types.h>
-#include <linux/pfn_t.h>
 #include <linux/acpi.h>
 #include <linux/io.h>
 #include <linux/mm.h>
@@ -135,16 +134,6 @@ void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 }
 EXPORT_SYMBOL_GPL(__wrap_devm_memremap_pages);

-pfn_t __wrap_phys_to_pfn_t(phys_addr_t addr, unsigned long flags)
-{
-	struct nfit_test_resource *nfit_res = get_nfit_res(addr);
-
-	if (nfit_res)
-		flags &= ~PFN_MAP;
-	return phys_to_pfn_t(addr, flags);
-}
-EXPORT_SYMBOL(__wrap_phys_to_pfn_t);
-
 void *__wrap_memremap(resource_size_t offset, size_t size,
 		unsigned long flags)
 {
diff --git a/tools/testing/nvdimm/test/nfit_test.h b/tools/testing/nvdimm/test/nfit_test.h
index b00583d1eace..b9047fb8ea4a 100644
--- a/tools/testing/nvdimm/test/nfit_test.h
+++ b/tools/testing/nvdimm/test/nfit_test.h
@@ -212,7 +212,6 @@ void __iomem *__wrap_devm_ioremap(struct device *dev,
 void *__wrap_devm_memremap(struct device *dev, resource_size_t offset,
 		size_t size, unsigned long flags);
 void *__wrap_devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap);
-pfn_t __wrap_phys_to_pfn_t(phys_addr_t addr, unsigned long flags);
 void *__wrap_memremap(resource_size_t offset, size_t size,
 		unsigned long flags);
 void __wrap_devm_memunmap(struct device *dev, void *addr);
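For anyone following the conversion, here is a minimal userspace sketch of what the removed pfn_t encoding did, and why a plain shift (what PHYS_PFN() amounts to) yields the same frame number once no flags are carried. The flag macros are copied from the deleted include/linux/pfn_t.h. The 4 KiB page size, the uint64_t types, and the helper names phys_pfn() and pfn_sketch_selftest() are assumptions made for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages and 64-bit values. */
#define PAGE_SHIFT		12
#define PAGE_SIZE		(1ULL << PAGE_SHIFT)
#define PAGE_MASK		(~(PAGE_SIZE - 1))
#define BITS_PER_LONG_LONG	64

/* Flag layout as in the removed include/linux/pfn_t.h. */
#define PFN_FLAGS_MASK	(((uint64_t)(~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
#define PFN_DEV		(1ULL << (BITS_PER_LONG_LONG - 3))
#define PFN_MAP		(1ULL << (BITS_PER_LONG_LONG - 4))

typedef struct {
	uint64_t val;
} pfn_t;

/* Old style: flags ride in the top PAGE_SHIFT bits of the value. */
static inline pfn_t phys_to_pfn_t(uint64_t addr, uint64_t flags)
{
	pfn_t pfn = { .val = (addr >> PAGE_SHIFT) | (flags & PFN_FLAGS_MASK) };

	return pfn;
}

/* Old style: every consumer had to mask the flags off again. */
static inline uint64_t pfn_t_to_pfn(pfn_t pfn)
{
	return pfn.val & ~PFN_FLAGS_MASK;
}

/* New style: hypothetical stand-in for the kernel's PHYS_PFN(). */
static inline uint64_t phys_pfn(uint64_t addr)
{
	return addr >> PAGE_SHIFT;
}

/* Hypothetical self-check showing the two styles agree on the frame number. */
static inline void pfn_sketch_selftest(void)
{
	uint64_t phys = 0x100003000ULL;
	pfn_t old = phys_to_pfn_t(phys, PFN_DEV | PFN_MAP);

	assert(old.val != phys_pfn(phys));		/* flags change the raw value */
	assert(pfn_t_to_pfn(old) == phys_pfn(phys));	/* masking recovers the pfn */
	assert(phys_to_pfn_t(phys, 0).val == phys_pfn(phys)); /* no flags: identical */
}
```

With the flags gone, nothing distinguishes the wrapped value from the raw frame number, which is why the diff above can replace pfn_t_to_pfn(pfn) with pfn at each call site.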
Hi Dan,
kernel test robot noticed the following build warnings:
[auto build test WARNING on s390/features]
[also build test WARNING on brauner-vfs/vfs.all akpm-mm/mm-everything linus/master v6.11 next-20240924]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Dan-Williams/dcssblk-Mark-DAX-broken/20240925-070047
base: https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features
patch link: https://lore.kernel.org/r/172721874675.497781.3277495908107141898.stgit%40dwillia2-xfh.jf.intel.com
patch subject: [PATCH] dcssblk: Mark DAX broken
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20240925/202409251251.i8yVl4yR-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240925/202409251251.i8yVl4yR-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409251251.i8yVl4yR-lkp@intel.com/
All warnings (new ones prefixed by >>):
   drivers/s390/block/dcssblk.c: In function 'dcssblk_add_store':
>> drivers/s390/block/dcssblk.c:571:28: warning: unused variable 'dax_dev' [-Wunused-variable]
     571 |         struct dax_device *dax_dev;
         |                            ^~~~~~~
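This kind of warning is the usual leftover of moving code into a helper: the local declaration stays behind in the caller while its only uses move away. A minimal sketch of the pattern and one possible fix follows, assuming the DAX setup now lives entirely in a helper. All types and function names below (demo_dax_device, demo_dev_info, demo_setup_dax(), demo_add_store()) are hypothetical stand-ins, not the actual dcssblk code; the real body of dcssblk_setup_dax() is not shown in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types. */
struct demo_dax_device {
	int id;
};

struct demo_dev_info {
	struct demo_dax_device *dax_dev;
};

static struct demo_dax_device demo_dax = { .id = 1 };

/* The helper declares the local where it is actually used. */
static int demo_setup_dax(struct demo_dev_info *dev_info)
{
	struct demo_dax_device *dax_dev = &demo_dax;

	dev_info->dax_dev = dax_dev;
	return 0;
}

/*
 * The caller no longer needs its own 'dax_dev'. Leaving a declaration like
 *	struct demo_dax_device *dax_dev;
 * here without any use is what -Wunused-variable reports in a W=1 build.
 */
static int demo_add_store(struct demo_dev_info *dev_info)
{
	int rc;

	rc = demo_setup_dax(dev_info);
	assert(rc != 0 || dev_info->dax_dev != NULL);
	return rc;
}
```

If the variable were still needed under only some configurations, the kernel's __maybe_unused annotation would be the other common option; which one fits here depends on the final shape of the patch.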
vim +/dax_dev +571 drivers/s390/block/dcssblk.c
e265834f5da2c4 Dan Williams 2024-09-24 557
^1da177e4c3f41 Linus Torvalds 2005-04-16 558 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 559 * device attribute for adding devices
^1da177e4c3f41 Linus Torvalds 2005-04-16 560 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 561 static ssize_t
e404e274f62665 Yani Ioannou 2005-05-17 562 dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
^1da177e4c3f41 Linus Torvalds 2005-04-16 563 {
af190c53c995bf Christoph Hellwig 2024-02-15 564 struct queue_limits lim = {
af190c53c995bf Christoph Hellwig 2024-02-15 565 .logical_block_size = 4096,
f467fee48da450 Christoph Hellwig 2024-06-17 566 .features = BLK_FEAT_DAX,
af190c53c995bf Christoph Hellwig 2024-02-15 567 };
b2300b9efe1b81 Hongjie Yang 2008-10-10 568 int rc, i, j, num_of_segments;
^1da177e4c3f41 Linus Torvalds 2005-04-16 569 struct dcssblk_dev_info *dev_info;
b2300b9efe1b81 Hongjie Yang 2008-10-10 570 struct segment_info *seg_info, *temp;
cf7fe690abbbe5 Mathieu Desnoyers 2024-02-15 @571 struct dax_device *dax_dev;
^1da177e4c3f41 Linus Torvalds 2005-04-16 572 char *local_buf;
^1da177e4c3f41 Linus Torvalds 2005-04-16 573 unsigned long seg_byte_size;
^1da177e4c3f41 Linus Torvalds 2005-04-16 574
^1da177e4c3f41 Linus Torvalds 2005-04-16 575 dev_info = NULL;
b2300b9efe1b81 Hongjie Yang 2008-10-10 576 seg_info = NULL;
^1da177e4c3f41 Linus Torvalds 2005-04-16 577 if (dev != dcssblk_root_dev) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 578 rc = -EINVAL;
^1da177e4c3f41 Linus Torvalds 2005-04-16 579 goto out_nobuf;
^1da177e4c3f41 Linus Torvalds 2005-04-16 580 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 581 if ((count < 1) || (buf[0] == '\0') || (buf[0] == '\n')) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 582 rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang 2008-10-10 583 goto out_nobuf;
b2300b9efe1b81 Hongjie Yang 2008-10-10 584 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 585
^1da177e4c3f41 Linus Torvalds 2005-04-16 586 local_buf = kmalloc(count + 1, GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds 2005-04-16 587 if (local_buf == NULL) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 588 rc = -ENOMEM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 589 goto out_nobuf;
^1da177e4c3f41 Linus Torvalds 2005-04-16 590 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 591
^1da177e4c3f41 Linus Torvalds 2005-04-16 592 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 593 * parse input
^1da177e4c3f41 Linus Torvalds 2005-04-16 594 */
b2300b9efe1b81 Hongjie Yang 2008-10-10 595 num_of_segments = 0;
3a9f9183bdd341 Ameen Ali 2015-02-24 596 for (i = 0; (i < count && (buf[i] != '\0') && (buf[i] != '\n')); i++) {
42cfc6b590c5eb Martin Schwidefsky 2015-08-19 597 for (j = i; j < count &&
42cfc6b590c5eb Martin Schwidefsky 2015-08-19 598 (buf[j] != ':') &&
b2300b9efe1b81 Hongjie Yang 2008-10-10 599 (buf[j] != '\0') &&
42cfc6b590c5eb Martin Schwidefsky 2015-08-19 600 (buf[j] != '\n'); j++) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 601 local_buf[j-i] = toupper(buf[j]);
b2300b9efe1b81 Hongjie Yang 2008-10-10 602 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 603 local_buf[j-i] = '\0';
b2300b9efe1b81 Hongjie Yang 2008-10-10 604 if (((j - i) == 0) || ((j - i) > 8)) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 605 rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang 2008-10-10 606 goto seg_list_del;
^1da177e4c3f41 Linus Torvalds 2005-04-16 607 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 608
b2300b9efe1b81 Hongjie Yang 2008-10-10 609 rc = dcssblk_load_segment(local_buf, &seg_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 610 if (rc < 0)
b2300b9efe1b81 Hongjie Yang 2008-10-10 611 goto seg_list_del;
^1da177e4c3f41 Linus Torvalds 2005-04-16 612 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 613 * get a struct dcssblk_dev_info
^1da177e4c3f41 Linus Torvalds 2005-04-16 614 */
b2300b9efe1b81 Hongjie Yang 2008-10-10 615 if (num_of_segments == 0) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 616 dev_info = kzalloc(sizeof(struct dcssblk_dev_info),
b2300b9efe1b81 Hongjie Yang 2008-10-10 617 GFP_KERNEL);
^1da177e4c3f41 Linus Torvalds 2005-04-16 618 if (dev_info == NULL) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 619 rc = -ENOMEM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 620 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 621 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 622 strcpy(dev_info->segment_name, local_buf);
b2300b9efe1b81 Hongjie Yang 2008-10-10 623 dev_info->segment_type = seg_info->segment_type;
b2300b9efe1b81 Hongjie Yang 2008-10-10 624 INIT_LIST_HEAD(&dev_info->seg_list);
b2300b9efe1b81 Hongjie Yang 2008-10-10 625 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 626 list_add_tail(&seg_info->lh, &dev_info->seg_list);
b2300b9efe1b81 Hongjie Yang 2008-10-10 627 num_of_segments++;
b2300b9efe1b81 Hongjie Yang 2008-10-10 628 i = j;
b2300b9efe1b81 Hongjie Yang 2008-10-10 629
b2300b9efe1b81 Hongjie Yang 2008-10-10 630 if ((buf[j] == '\0') || (buf[j] == '\n'))
b2300b9efe1b81 Hongjie Yang 2008-10-10 631 break;
b2300b9efe1b81 Hongjie Yang 2008-10-10 632 }
b2300b9efe1b81 Hongjie Yang 2008-10-10 633
b2300b9efe1b81 Hongjie Yang 2008-10-10 634 /* no trailing colon at the end of the input */
b2300b9efe1b81 Hongjie Yang 2008-10-10 635 if ((i > 0) && (buf[i-1] == ':')) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 636 rc = -ENAMETOOLONG;
b2300b9efe1b81 Hongjie Yang 2008-10-10 637 goto seg_list_del;
b2300b9efe1b81 Hongjie Yang 2008-10-10 638 }
820109fb11f24b Wolfram Sang 2022-08-18 639 strscpy(local_buf, buf, i + 1);
b2300b9efe1b81 Hongjie Yang 2008-10-10 640 dev_info->num_of_segments = num_of_segments;
b2300b9efe1b81 Hongjie Yang 2008-10-10 641 rc = dcssblk_is_continuous(dev_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 642 if (rc < 0)
b2300b9efe1b81 Hongjie Yang 2008-10-10 643 goto seg_list_del;
b2300b9efe1b81 Hongjie Yang 2008-10-10 644
b2300b9efe1b81 Hongjie Yang 2008-10-10 645 dev_info->start = dcssblk_find_lowest_addr(dev_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 646 dev_info->end = dcssblk_find_highest_addr(dev_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 647
ef283688f54cc8 Kees Cook 2014-06-10 648 dev_set_name(&dev_info->dev, "%s", dev_info->segment_name);
^1da177e4c3f41 Linus Torvalds 2005-04-16 649 dev_info->dev.release = dcssblk_release_segment;
521b3d790c16fa Sebastian Ott 2012-10-01 650 dev_info->dev.groups = dcssblk_dev_attr_groups;
^1da177e4c3f41 Linus Torvalds 2005-04-16 651 INIT_LIST_HEAD(&dev_info->lh);
af190c53c995bf Christoph Hellwig 2024-02-15 652 dev_info->gd = blk_alloc_disk(&lim, NUMA_NO_NODE);
74fa8f9c553f7b Christoph Hellwig 2024-02-15 653 if (IS_ERR(dev_info->gd)) {
74fa8f9c553f7b Christoph Hellwig 2024-02-15 654 rc = PTR_ERR(dev_info->gd);
b2300b9efe1b81 Hongjie Yang 2008-10-10 655 goto seg_list_del;
^1da177e4c3f41 Linus Torvalds 2005-04-16 656 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 657 dev_info->gd->major = dcssblk_major;
0692ef289f067d Christoph Hellwig 2021-05-21 658 dev_info->gd->minors = DCSSBLK_MINORS_PER_DISK;
^1da177e4c3f41 Linus Torvalds 2005-04-16 659 dev_info->gd->fops = &dcssblk_devops;
^1da177e4c3f41 Linus Torvalds 2005-04-16 660 dev_info->gd->private_data = dev_info;
a41a11b4009580 Gerald Schaefer 2022-10-27 661 dev_info->gd->flags |= GENHD_FL_NO_PART;
b2300b9efe1b81 Hongjie Yang 2008-10-10 662
^1da177e4c3f41 Linus Torvalds 2005-04-16 663 seg_byte_size = (dev_info->end - dev_info->start + 1);
^1da177e4c3f41 Linus Torvalds 2005-04-16 664 set_capacity(dev_info->gd, seg_byte_size >> 9); // size in sectors
93098bf0157876 Hongjie Yang 2008-12-25 665 pr_info("Loaded %s with total size %lu bytes and capacity %lu "
93098bf0157876 Hongjie Yang 2008-12-25 666 "sectors\n", local_buf, seg_byte_size, seg_byte_size >> 9);
^1da177e4c3f41 Linus Torvalds 2005-04-16 667
^1da177e4c3f41 Linus Torvalds 2005-04-16 668 dev_info->save_pending = 0;
^1da177e4c3f41 Linus Torvalds 2005-04-16 669 dev_info->is_shared = 1;
^1da177e4c3f41 Linus Torvalds 2005-04-16 670 dev_info->dev.parent = dcssblk_root_dev;
^1da177e4c3f41 Linus Torvalds 2005-04-16 671
^1da177e4c3f41 Linus Torvalds 2005-04-16 672 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 673 *get minor, add to list
^1da177e4c3f41 Linus Torvalds 2005-04-16 674 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 675 down_write(&dcssblk_devices_sem);
b2300b9efe1b81 Hongjie Yang 2008-10-10 676 if (dcssblk_get_segment_by_name(local_buf)) {
04f64b5756872b Gerald Schaefer 2008-08-21 677 rc = -EEXIST;
b2300b9efe1b81 Hongjie Yang 2008-10-10 678 goto release_gd;
04f64b5756872b Gerald Schaefer 2008-08-21 679 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 680 rc = dcssblk_assign_free_minor(dev_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 681 if (rc)
b2300b9efe1b81 Hongjie Yang 2008-10-10 682 goto release_gd;
^1da177e4c3f41 Linus Torvalds 2005-04-16 683 sprintf(dev_info->gd->disk_name, "dcssblk%d",
d0591485e15ccd Gerald Schaefer 2009-06-12 684 dev_info->gd->first_minor);
^1da177e4c3f41 Linus Torvalds 2005-04-16 685 list_add_tail(&dev_info->lh, &dcssblk_devices);
^1da177e4c3f41 Linus Torvalds 2005-04-16 686
^1da177e4c3f41 Linus Torvalds 2005-04-16 687 if (!try_module_get(THIS_MODULE)) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 688 rc = -ENODEV;
b2300b9efe1b81 Hongjie Yang 2008-10-10 689 goto dev_list_del;
^1da177e4c3f41 Linus Torvalds 2005-04-16 690 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 691 /*
^1da177e4c3f41 Linus Torvalds 2005-04-16 692 * register the device
^1da177e4c3f41 Linus Torvalds 2005-04-16 693 */
^1da177e4c3f41 Linus Torvalds 2005-04-16 694 rc = device_register(&dev_info->dev);
b2300b9efe1b81 Hongjie Yang 2008-10-10 695 if (rc)
521b3d790c16fa Sebastian Ott 2012-10-01 696 goto put_dev;
^1da177e4c3f41 Linus Torvalds 2005-04-16 697
e265834f5da2c4 Dan Williams 2024-09-24 698 rc = dcssblk_setup_dax(dev_info);
fb08a1908cb119 Christoph Hellwig 2021-11-29 699 if (rc)
fb08a1908cb119 Christoph Hellwig 2021-11-29 700 goto out_dax;
7a2765f6e82063 Dan Williams 2017-01-26 701
521b3d790c16fa Sebastian Ott 2012-10-01 702 get_device(&dev_info->dev);
1a5db707c859a4 Gerald Schaefer 2021-09-27 703 rc = device_add_disk(&dev_info->dev, dev_info->gd, NULL);
1a5db707c859a4 Gerald Schaefer 2021-09-27 704 if (rc)
fb08a1908cb119 Christoph Hellwig 2021-11-29 705 goto out_dax_host;
436d1bc7fe6e78 Christian Borntraeger 2007-12-04 706
^1da177e4c3f41 Linus Torvalds 2005-04-16 707 switch (dev_info->segment_type) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 708 case SEG_TYPE_SR:
^1da177e4c3f41 Linus Torvalds 2005-04-16 709 case SEG_TYPE_ER:
^1da177e4c3f41 Linus Torvalds 2005-04-16 710 case SEG_TYPE_SC:
^1da177e4c3f41 Linus Torvalds 2005-04-16 711 set_disk_ro(dev_info->gd,1);
^1da177e4c3f41 Linus Torvalds 2005-04-16 712 break;
^1da177e4c3f41 Linus Torvalds 2005-04-16 713 default:
^1da177e4c3f41 Linus Torvalds 2005-04-16 714 set_disk_ro(dev_info->gd,0);
^1da177e4c3f41 Linus Torvalds 2005-04-16 715 break;
^1da177e4c3f41 Linus Torvalds 2005-04-16 716 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 717 up_write(&dcssblk_devices_sem);
^1da177e4c3f41 Linus Torvalds 2005-04-16 718 rc = count;
^1da177e4c3f41 Linus Torvalds 2005-04-16 719 goto out;
^1da177e4c3f41 Linus Torvalds 2005-04-16 720
fb08a1908cb119 Christoph Hellwig 2021-11-29 721 out_dax_host:
c8f40a0bccefd6 Gerald Schaefer 2023-08-10 722 put_device(&dev_info->dev);
fb08a1908cb119 Christoph Hellwig 2021-11-29 723 dax_remove_host(dev_info->gd);
1a5db707c859a4 Gerald Schaefer 2021-09-27 724 out_dax:
1a5db707c859a4 Gerald Schaefer 2021-09-27 725 kill_dax(dev_info->dax_dev);
1a5db707c859a4 Gerald Schaefer 2021-09-27 726 put_dax(dev_info->dax_dev);
521b3d790c16fa Sebastian Ott 2012-10-01 727 put_dev:
^1da177e4c3f41 Linus Torvalds 2005-04-16 728 list_del(&dev_info->lh);
8b9ab62662048a Christoph Hellwig 2022-06-19 729 put_disk(dev_info->gd);
b2300b9efe1b81 Hongjie Yang 2008-10-10 730 list_for_each_entry(seg_info, &dev_info->seg_list, lh) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 731 segment_unload(seg_info->segment_name);
b2300b9efe1b81 Hongjie Yang 2008-10-10 732 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 733 put_device(&dev_info->dev);
^1da177e4c3f41 Linus Torvalds 2005-04-16 734 up_write(&dcssblk_devices_sem);
^1da177e4c3f41 Linus Torvalds 2005-04-16 735 goto out;
b2300b9efe1b81 Hongjie Yang 2008-10-10 736 dev_list_del:
^1da177e4c3f41 Linus Torvalds 2005-04-16 737 list_del(&dev_info->lh);
b2300b9efe1b81 Hongjie Yang 2008-10-10 738 release_gd:
8b9ab62662048a Christoph Hellwig 2022-06-19 739 put_disk(dev_info->gd);
b2300b9efe1b81 Hongjie Yang 2008-10-10 740 up_write(&dcssblk_devices_sem);
b2300b9efe1b81 Hongjie Yang 2008-10-10 741 seg_list_del:
b2300b9efe1b81 Hongjie Yang 2008-10-10 742 if (dev_info == NULL)
b2300b9efe1b81 Hongjie Yang 2008-10-10 743 goto out;
b2300b9efe1b81 Hongjie Yang 2008-10-10 744 list_for_each_entry_safe(seg_info, temp, &dev_info->seg_list, lh) {
b2300b9efe1b81 Hongjie Yang 2008-10-10 745 list_del(&seg_info->lh);
b2300b9efe1b81 Hongjie Yang 2008-10-10 746 segment_unload(seg_info->segment_name);
b2300b9efe1b81 Hongjie Yang 2008-10-10 747 kfree(seg_info);
b2300b9efe1b81 Hongjie Yang 2008-10-10 748 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 749 kfree(dev_info);
^1da177e4c3f41 Linus Torvalds 2005-04-16 750 out:
^1da177e4c3f41 Linus Torvalds 2005-04-16 751 kfree(local_buf);
^1da177e4c3f41 Linus Torvalds 2005-04-16 752 out_nobuf:
^1da177e4c3f41 Linus Torvalds 2005-04-16 753 return rc;
^1da177e4c3f41 Linus Torvalds 2005-04-16 754 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 755
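For readers following the listing above: the capacity passed to set_capacity() is the inclusive byte span of the segment (end - start + 1) shifted right by 9, i.e. divided into 512-byte sectors. A minimal userspace sketch of that arithmetic follows. The function names model_seg_bytes() and model_bytes_to_sectors() are hypothetical and exist only for this illustration; they are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* 512-byte sectors, matching the ">> 9" in the listing. */
#define MODEL_SECTOR_SHIFT 9

/* Size of a segment whose first and last byte addresses are both inclusive. */
static uint64_t model_seg_bytes(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}

/* Whole sectors that fit in the given byte count (any remainder is dropped). */
static uint64_t model_bytes_to_sectors(uint64_t bytes)
{
	return bytes >> MODEL_SECTOR_SHIFT;
}
```

For example, a segment spanning 0x100000 through 0x1fffff is 1 MiB and yields 2048 sectors under this model.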
On Tue, Sep 24, 2024 at 03:59:08PM -0700, Dan Williams wrote:

Hi Dan,

> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
>
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
>
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
>
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
>
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
...
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig   |   12 ++++++++++--
>  drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
>  fs/Kconfig                   |    9 +--------
>  fs/dax.c                     |   12 ------------
>  include/linux/pfn_t.h        |   15 ---------------
>  mm/memory.c                  |    2 --
>  mm/memremap.c                |    4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)
...

I guess you want to remove dcssblk from Documentation/filesystems/dax.rst.

Gerald is back from vacation on Monday and he will likely comment on this.

Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>

Thanks!
On 25.09.24 00:59, Dan Williams wrote:
> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
>
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
>
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.
>
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).
>
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>
On Tue, 24 Sep 2024 15:59:08 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> The dcssblk driver has long needed special case support to enable
> limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
> works around the incomplete support for ZONE_DEVICE on s390 by forgoing
> the ability of dax-mapped pages to support GUP.
>
> Now, pending cleanups to fsdax that fix its reference counting [1] depend on
> the ability of all dax drivers to supply ZONE_DEVICE pages.
>
> To allow that work to move forward, dax support needs to be paused for
> dcssblk until ZONE_DEVICE support arrives. That work has been known for
> a few years [2], and the removal of "pte_devmap" requirements [3] makes the
> conversion easier.

Thanks, that's great news! Without requiring the extra PTE bit, it should
now finally be possible to add struct pages and ZONE_DEVICE support for
dcssblk.

In the meantime, it is OK to pause the DAX support for dcssblk as you
suggested, and finally remove that ugly CONFIG_FS_DAX_LIMITED. Thanks for
bearing with us for so long!

>
> For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
> (dcssblk was the only user).

Ok, I guess that PFN_SPECIAL was there because we had no struct pages for
the DCSS memory. When we come back, with proper ZONE_DEVICE and struct
pages, it should not be needed any more.

And yes, the chance to completely remove pfn_t, after Alistair's series,
is quite impressive and even more motivation than CONFIG_FS_DAX_LIMITED.

>
> Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
> Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
> Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Alistair Popple <apopple@nvidia.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/s390/block/Kconfig   |   12 ++++++++++--
>  drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
>  fs/Kconfig                   |    9 +--------
>  fs/dax.c                     |   12 ------------
>  include/linux/pfn_t.h        |   15 ---------------
>  mm/memory.c                  |    2 --
>  mm/memremap.c                |    4 ----
>  7 files changed, 28 insertions(+), 52 deletions(-)

When you also remove the now unused dax_dev definition at the top of
dcssblk_add_store(), as noticed by kernel test robot:

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Gerald Schaefer wrote:
[..]
> When you also remove the now unused dax_dev definition at the top of
> dcssblk_add_store(), as noticed by kernel test robot:

Yup, already have that fixup locally.

> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>

Thanks!
diff --git a/drivers/s390/block/Kconfig b/drivers/s390/block/Kconfig
index e3710a762aba..4bfe469c04aa 100644
--- a/drivers/s390/block/Kconfig
+++ b/drivers/s390/block/Kconfig
@@ -4,13 +4,21 @@ comment "S/390 block device drivers"
 
 config DCSSBLK
 	def_tristate m
-	select FS_DAX_LIMITED
-	select DAX
 	prompt "DCSSBLK support"
 	depends on S390 && BLOCK
 	help
 	  Support for dcss block device
 
+config DCSSBLK_DAX
+	def_bool y
+	depends on DCSSBLK
+	# requires S390 ZONE_DEVICE support
+	depends on BROKEN
+	select DAX
+	prompt "DCSSBLK DAX support"
+	help
+	  Enable DAX operation for the dcss block device
+
 config DASD
 	def_tristate y
 	prompt "Support for DASD devices"
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 02a4a51da1b7..d1bc79cf56bd 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -540,6 +540,21 @@ static const struct attribute_group *dcssblk_dev_attr_groups[] = {
 	NULL,
 };
 
+static int dcssblk_setup_dax(struct dcssblk_dev_info *dev_info)
+{
+	struct dax_device *dax_dev;
+
+	if (!IS_ENABLED(CONFIG_DCSSBLK_DAX))
+		return 0;
+
+	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
+	if (IS_ERR(dax_dev))
+		return PTR_ERR(dax_dev);
+	set_dax_synchronous(dax_dev);
+	dev_info->dax_dev = dax_dev;
+	return dax_add_host(dev_info->dax_dev, dev_info->gd);
+}
+
 /*
  * device attribute for adding devices
  */
@@ -680,14 +695,7 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
 	if (rc)
 		goto put_dev;
 
-	dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
-	if (IS_ERR(dax_dev)) {
-		rc = PTR_ERR(dax_dev);
-		goto put_dev;
-	}
-	set_dax_synchronous(dax_dev);
-	dev_info->dax_dev = dax_dev;
-	rc = dax_add_host(dev_info->dax_dev, dev_info->gd);
+	rc = dcssblk_setup_dax(dev_info);
 	if (rc)
 		goto out_dax;
 
@@ -923,7 +931,7 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 		*kaddr = __va(dev_info->start + offset);
 	if (pfn)
 		*pfn = __pfn_to_pfn_t(PFN_DOWN(dev_info->start + offset),
-				PFN_DEV|PFN_SPECIAL);
+				PFN_DEV);
 
 	return (dev_sz - offset) / PAGE_SIZE;
 }
diff --git a/fs/Kconfig b/fs/Kconfig
index 0e4efec1d92e..a6f4f28fa09e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -60,7 +60,7 @@ endif # BLOCK
 config FS_DAX
 	bool "File system based Direct Access (DAX) support"
 	depends on MMU
-	depends on ZONE_DEVICE || FS_DAX_LIMITED
+	depends on ZONE_DEVICE
 	select FS_IOMAP
 	select DAX
 	help
@@ -96,13 +96,6 @@ config FS_DAX_PMD
 	depends on ZONE_DEVICE
 	depends on TRANSPARENT_HUGEPAGE
 
-# Selected by DAX drivers that do not expect filesystem DAX to support
-# get_user_pages() of DAX mappings. I.e. "limited" indicates no support
-# for fork() of processes with MAP_SHARED mappings or support for
-# direct-I/O to a DAX mapping.
-config FS_DAX_LIMITED
-	bool
-
 # Posix ACL utility routines
 #
 # Note: Posix ACLs can be implemented without these helpers. Never use
diff --git a/fs/dax.c b/fs/dax.c
index becb4a6920c6..6257d3fdf8f8 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -359,9 +359,6 @@ static void dax_associate_entry(void *entry, struct address_space *mapping,
 	unsigned long size = dax_entry_size(entry), pfn, index;
 	int i = 0;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	index = linear_page_index(vma, address & ~(size - 1));
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
@@ -381,9 +378,6 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping,
 {
 	unsigned long pfn;
 
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return;
-
 	for_each_mapped_pfn(entry, pfn) {
 		struct page *page = pfn_to_page(pfn);
 
@@ -684,12 +678,6 @@ struct page *dax_layout_busy_page_range(struct address_space *mapping,
 	pgoff_t end_idx;
 	XA_STATE(xas, &mapping->i_pages, start_idx);
 
-	/*
-	 * In the 'limited' case get_user_pages() for dax is disabled.
-	 */
-	if (IS_ENABLED(CONFIG_FS_DAX_LIMITED))
-		return NULL;
-
 	if (!dax_mapping(mapping) || !mapping_mapped(mapping))
 		return NULL;
 
diff --git a/include/linux/pfn_t.h b/include/linux/pfn_t.h
index 2d9148221e9a..eb8da94d1d19 100644
--- a/include/linux/pfn_t.h
+++ b/include/linux/pfn_t.h
@@ -9,18 +9,14 @@
  * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
  * PFN_DEV - pfn is not covered by system memmap by default
  * PFN_MAP - pfn has a dynamic page mapping established by a device driver
- * PFN_SPECIAL - for CONFIG_FS_DAX_LIMITED builds to allow XIP, but not
- *		 get_user_pages
  */
 #define PFN_FLAGS_MASK (((u64) (~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
 #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
 #define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
 #define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
 #define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
-#define PFN_SPECIAL (1ULL << (BITS_PER_LONG_LONG - 5))
 
 #define PFN_FLAGS_TRACE \
-	{ PFN_SPECIAL,	"SPECIAL" }, \
 	{ PFN_SG_CHAIN,	"SG_CHAIN" }, \
 	{ PFN_SG_LAST,	"SG_LAST" }, \
 	{ PFN_DEV,	"DEV" }, \
@@ -117,15 +113,4 @@ pud_t pud_mkdevmap(pud_t pud);
 #endif
 #endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
 
-#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return (pfn.val & PFN_SPECIAL) == PFN_SPECIAL;
-}
-#else
-static inline bool pfn_t_special(pfn_t pfn)
-{
-	return false;
-}
-#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
 #endif /* _LINUX_PFN_T_H_ */
diff --git a/mm/memory.c b/mm/memory.c
index c31ea300cdf6..676f5cda992a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2462,8 +2462,6 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn, bool mkwrite)
 		return true;
 	if (pfn_t_devmap(pfn))
 		return true;
-	if (pfn_t_special(pfn))
-		return true;
 	if (is_zero_pfn(pfn_t_to_pfn(pfn)))
 		return true;
 	return false;
diff --git a/mm/memremap.c b/mm/memremap.c
index 40d4547ce514..a6bbbe180eab 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -332,10 +332,6 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 		}
 		break;
 	case MEMORY_DEVICE_FS_DAX:
-		if (IS_ENABLED(CONFIG_FS_DAX_LIMITED)) {
-			WARN(1, "File system DAX not supported\n");
-			return ERR_PTR(-EINVAL);
-		}
 		params.pgprot = pgprot_decrypted(params.pgprot);
 		break;
 	case MEMORY_DEVICE_GENERIC:
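To make the include/linux/pfn_t.h and mm/memory.c hunks easier to follow, here is a small userspace model of how the pfn_t flag bits are packed into the top of a 64-bit value. The flag definitions are copied from the hunk above. Everything else is an assumption made for illustration: PAGE_SHIFT is fixed at 12, BITS_PER_LONG_LONG at 64, the model_* names are hypothetical, and the helper bodies are a reconstruction of what the kernel helpers do rather than text taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for this model; the kernel derives them per architecture. */
#define PAGE_SHIFT		12
#define BITS_PER_LONG_LONG	64
#define PAGE_MASK		(~((1ULL << PAGE_SHIFT) - 1))

/* Flag layout as shown in the include/linux/pfn_t.h hunk. */
#define PFN_FLAGS_MASK (((uint64_t)(~PAGE_MASK)) << (BITS_PER_LONG_LONG - PAGE_SHIFT))
#define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
#define PFN_SG_LAST (1ULL << (BITS_PER_LONG_LONG - 2))
#define PFN_DEV (1ULL << (BITS_PER_LONG_LONG - 3))
#define PFN_MAP (1ULL << (BITS_PER_LONG_LONG - 4))
/* The bit the patch retires; kept only to show where it used to live. */
#define OLD_PFN_SPECIAL (1ULL << (BITS_PER_LONG_LONG - 5))

/* Hypothetical stand-in for the kernel's pfn_t wrapper type. */
typedef struct { uint64_t val; } model_pfn_t;

/* Pack a frame number and flags into one value (models __pfn_to_pfn_t()). */
static model_pfn_t model_pfn_to_pfn_t(uint64_t pfn, uint64_t flags)
{
	model_pfn_t p;

	/* The frame number must not spill into the flag bits. */
	assert((pfn & PFN_FLAGS_MASK) == 0);
	p.val = pfn | (flags & PFN_FLAGS_MASK);
	return p;
}

/* Strip the flag bits to recover the raw frame number. */
static uint64_t model_pfn_t_to_pfn(model_pfn_t p)
{
	return p.val & ~PFN_FLAGS_MASK;
}

/* Assumed devmap test: both PFN_DEV and PFN_MAP must be set. */
static bool model_pfn_t_devmap(model_pfn_t p)
{
	const uint64_t flags = PFN_DEV | PFN_MAP;

	return (p.val & flags) == flags;
}
```

Under these assumptions, a value built with only PFN_DEV, which is what __dcssblk_direct_access() passes after this patch, round-trips its frame number but does not satisfy the devmap test, and with pfn_t_special() gone there is no remaining flag-based branch in vm_mixed_ok() that it would take.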
The dcssblk driver has long needed special case support to enable
limited dax operation, so called CONFIG_FS_DAX_LIMITED. This mode
works around the incomplete support for ZONE_DEVICE on s390 by forgoing
the ability of dax-mapped pages to support GUP.

Now, pending cleanups to fsdax that fix its reference counting [1] depend on
the ability of all dax drivers to supply ZONE_DEVICE pages.

To allow that work to move forward, dax support needs to be paused for
dcssblk until ZONE_DEVICE support arrives. That work has been known for
a few years [2], and the removal of "pte_devmap" requirements [3] makes the
conversion easier.

For now, place the support behind CONFIG_BROKEN, and remove PFN_SPECIAL
(dcssblk was the only user).

Link: http://lore.kernel.org/cover.9f0e45d52f5cff58807831b6b867084d0b14b61c.1725941415.git-series.apopple@nvidia.com [1]
Link: http://lore.kernel.org/20210820210318.187742e8@thinkpad/ [2]
Link: http://lore.kernel.org/4511465a4f8429f45e2ac70d2e65dc5e1df1eb47.1725941415.git-series.apopple@nvidia.com [3]
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/s390/block/Kconfig   |   12 ++++++++++--
 drivers/s390/block/dcssblk.c |   26 +++++++++++++++++---------
 fs/Kconfig                   |    9 +--------
 fs/dax.c                     |   12 ------------
 include/linux/pfn_t.h        |   15 ---------------
 mm/memory.c                  |    2 --
 mm/memremap.c                |    4 ----
 7 files changed, 28 insertions(+), 52 deletions(-)
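As an illustration of what "place the support behind CONFIG_BROKEN" means for the driver's add path, the sketch below models the early-return shape of the new setup helper in plain userspace C. It is not kernel code: MODEL_DCSSBLK_DAX is a hypothetical stand-in for IS_ENABLED(CONFIG_DCSSBLK_DAX), and the model_* types and function are invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for IS_ENABLED(CONFIG_DCSSBLK_DAX). 0 models a
 * build where CONFIG_BROKEN is not set, so the DAX option cannot be enabled.
 */
#ifndef MODEL_DCSSBLK_DAX
#define MODEL_DCSSBLK_DAX 0
#endif

/* Invented types standing in for the dax device and per-device info. */
struct model_dax_device { int synchronous; };
struct model_dev_info { struct model_dax_device *dax_dev; };

/* Returns 0 on success or a negative errno value. */
static int model_setup_dax(struct model_dev_info *info)
{
	struct model_dax_device *dax_dev;

	assert(info != NULL);

	/* DAX compiled out: report success and leave dax_dev untouched. */
	if (!MODEL_DCSSBLK_DAX)
		return 0;

	dax_dev = calloc(1, sizeof(*dax_dev));
	if (!dax_dev)
		return -ENOMEM;
	dax_dev->synchronous = 1;
	info->dax_dev = dax_dev;
	return 0;
}
```

With the flag at 0 the caller sees success and a NULL dax_dev, so the block device is still registered and only the DAX side is skipped. Building the sketch with -DMODEL_DCSSBLK_DAX=1 exercises the allocation path instead.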