| Message ID | e98b7e6bed4c1c63feac7b907439168388ecc9fd.1738709036.git-series.apopple@nvidia.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | fs/dax: Fix ZONE_DEVICE page reference counts |
On 04.02.25 23:48, Alistair Popple wrote:
> Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This
> creates a special devmap PTE entry for the pfn but does not take a
> reference on the underlying struct page for the mapping. This is
> because DAX page refcounts are treated specially, as indicated by the
> presence of a devmap entry.
>
> To allow DAX page refcounts to be managed the same as normal page
> refcounts introduce vmf_insert_page_mkwrite(). This will take a
> reference on the underlying page much the same as vmf_insert_page,
> except it also permits upgrading an existing mapping to be writable if
> requested/possible.
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>
> ---
>
> Changes for v7:
>
>  - Fix vmf_insert_page_mkwrite by removing pfn gunk as suggested by
>    David.
>
> Updates from v2:
>
>  - Rename function to make not DAX specific
>
>  - Split the insert_page_into_pte_locked() change into a separate
>    patch.
>
> Updates from v1:
>
>  - Re-arrange code in insert_page_into_pte_locked() based on comments
>    from Jan Kara.
>
>  - Call mkdirty/mkyoung for the mkwrite case, also suggested by Jan.
> ---
>  include/linux/mm.h |  2 ++
>  mm/memory.c        | 21 +++++++++++++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7b1068d..6567ece 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3544,6 +3544,8 @@ int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
>  				unsigned long num);
>  int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
>  				unsigned long num);
> +vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page,
> +		bool write);
>  vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
>  			unsigned long pfn);
>  vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
> diff --git a/mm/memory.c b/mm/memory.c
> index 41befd9..b88b488 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2622,6 +2622,27 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
>  	return VM_FAULT_NOPAGE;
>  }
>
> +vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page,
> +		bool write)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	pgprot_t pgprot = vma->vm_page_prot;

Probably could have avoided that temp without harming readability

Acked-by: David Hildenbrand <david@redhat.com>
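For reference, here is a minimal sketch of the variant David is hinting at, with the `pgprot` temporary folded directly into the `insert_page()` call. This is an untested illustration of the review comment, not code from the series:

```c
/*
 * Sketch of David's suggestion: drop the pgprot temporary and pass
 * vma->vm_page_prot straight to insert_page(). Untested illustration.
 */
vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page,
		bool write)
{
	struct vm_area_struct *vma = vmf->vma;
	unsigned long addr = vmf->address;
	int err;

	if (addr < vma->vm_start || addr >= vma->vm_end)
		return VM_FAULT_SIGBUS;

	err = insert_page(vma, addr, page, vma->vm_page_prot, write);
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	if (err < 0 && err != -EBUSY)
		return VM_FAULT_SIGBUS;

	return VM_FAULT_NOPAGE;
}
```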
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068d..6567ece 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3544,6 +3544,8 @@ int vm_map_pages(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
 int vm_map_pages_zero(struct vm_area_struct *vma, struct page **pages,
 				unsigned long num);
+vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page,
+		bool write);
 vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 41befd9..b88b488 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2622,6 +2622,27 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
 	return VM_FAULT_NOPAGE;
 }

+vm_fault_t vmf_insert_page_mkwrite(struct vm_fault *vmf, struct page *page,
+		bool write)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	pgprot_t pgprot = vma->vm_page_prot;
+	unsigned long addr = vmf->address;
+	int err;
+
+	if (addr < vma->vm_start || addr >= vma->vm_end)
+		return VM_FAULT_SIGBUS;
+
+	err = insert_page(vma, addr, page, pgprot, write);
+	if (err == -ENOMEM)
+		return VM_FAULT_OOM;
+	if (err < 0 && err != -EBUSY)
+		return VM_FAULT_SIGBUS;
+
+	return VM_FAULT_NOPAGE;
+}
+EXPORT_SYMBOL_GPL(vmf_insert_page_mkwrite);
+
 vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 		pfn_t pfn)
 {
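To show how a consumer might call the new helper, here is a hedged sketch of a hypothetical device fault handler; `my_dev_fault()` and `my_dev_get_page()` are invented placeholders, not part of this series:

```c
/* Hypothetical caller sketch; my_dev_get_page() is an invented helper. */
static vm_fault_t my_dev_fault(struct vm_fault *vmf)
{
	struct page *page = my_dev_get_page(vmf->vma->vm_private_data,
					    vmf->pgoff);

	if (!page)
		return VM_FAULT_SIGBUS;

	/*
	 * Takes a reference on the page (unlike vmf_insert_pfn()) and,
	 * on a write fault, may upgrade an existing read-only mapping
	 * to writable.
	 */
	return vmf_insert_page_mkwrite(vmf, page,
				       vmf->flags & FAULT_FLAG_WRITE);
}
```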
Currently to map a DAX page the DAX driver calls vmf_insert_pfn. This creates a special devmap PTE entry for the pfn but does not take a reference on the underlying struct page for the mapping. This is because DAX page refcounts are treated specially, as indicated by the presence of a devmap entry.

To allow DAX page refcounts to be managed the same as normal page refcounts introduce vmf_insert_page_mkwrite(). This will take a reference on the underlying page much the same as vmf_insert_page, except it also permits upgrading an existing mapping to be writable if requested/possible.

Signed-off-by: Alistair Popple <apopple@nvidia.com>

---

Changes for v7:

- Fix vmf_insert_page_mkwrite by removing pfn gunk as suggested by David.

Updates from v2:

- Rename function to make not DAX specific

- Split the insert_page_into_pte_locked() change into a separate patch.

Updates from v1:

- Re-arrange code in insert_page_into_pte_locked() based on comments from Jan Kara.

- Call mkdirty/mkyoung for the mkwrite case, also suggested by Jan.

---
 include/linux/mm.h |  2 ++
 mm/memory.c        | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+)
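For contrast, here is a sketch of the pattern the commit message says is being replaced, where the fault path inserts a raw pfn and no reference is taken on the struct page. The surrounding function is illustrative, not actual fs/dax.c code:

```c
/* Illustrative old-style mapping: no page reference is taken. */
static vm_fault_t dax_fault_old_style(struct vm_fault *vmf, unsigned long pfn)
{
	/*
	 * vmf_insert_pfn() installs a special devmap PTE for the pfn;
	 * the underlying struct page refcount is left untouched, which
	 * is the special casing this series removes.
	 */
	return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
}
```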