Message ID | 68974d46091eea460f404f8ced3c6de5964c9ec4.1736488799.git-series.apopple@nvidia.com |
---|---|
State | New |
Series | fs/dax: Fix ZONE_DEVICE page reference counts |
Alistair Popple wrote:
> In preparation for using insert_page() for DAX, enhance
> insert_page_into_pte_locked() to handle establishing writable
> mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
> PTE which bypasses the typical set_pte_range() in finish_fault.
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
>
> ---
>
> Changes for v5:
> - Minor comment/formatting fixes suggested by David Hildenbrand
>
> Changes since v2:
>
> - New patch split out from "mm/memory: Add dax_insert_pfn"
> ---
>  mm/memory.c | 37 +++++++++++++++++++++++++++++--------
>  1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 06bb29e..8531acb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2126,19 +2126,40 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
>  }
>
>  static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
> -			unsigned long addr, struct page *page, pgprot_t prot)
> +			unsigned long addr, struct page *page,
> +			pgprot_t prot, bool mkwrite)
>  {
>  	struct folio *folio = page_folio(page);
> +	pte_t entry = ptep_get(pte);
>  	pte_t pteval;
>
> -	if (!pte_none(ptep_get(pte)))
> -		return -EBUSY;
> +	if (!pte_none(entry)) {
> +		if (!mkwrite)
> +			return -EBUSY;
> +
> +		/* see insert_pfn(). */
> +		if (pte_pfn(entry) != page_to_pfn(page)) {
> +			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
> +			return -EFAULT;
> +		}
> +		entry = maybe_mkwrite(entry, vma);
> +		entry = pte_mkyoung(entry);
> +		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
> +			update_mmu_cache(vma, addr, pte);
> +		return 0;
> +	}

This hunk feels like it is begging to be unified with insert_pfn() after
pfn_t dies. Perhaps a TODO to remember to come back and unify them, or
you can go append that work to your pfn_t removal series?

Other than that you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
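To make the suggested unification concrete: the duplicated piece is the "upgrade an existing PTE of the same pfn to writable" step. A rough sketch of a shared helper follows; the name and factoring are hypothetical, and one wrinkle would need resolving, namely that insert_pfn() additionally applies pte_mkdirty() on this path while the hunk above does not.

```c
/*
 * Hypothetical common helper for insert_pfn() and
 * insert_page_into_pte_locked(); illustrative only, not part of
 * this series.
 */
static int pte_upgrade_to_writable(struct vm_area_struct *vma, pte_t *pte,
				   unsigned long addr, unsigned long pfn)
{
	pte_t entry = ptep_get(pte);

	/* Only an existing mapping of the same (or the zero) pfn may be upgraded. */
	if (pte_pfn(entry) != pfn) {
		WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
		return -EFAULT;
	}
	entry = maybe_mkwrite(entry, vma);
	entry = pte_mkyoung(entry);
	if (ptep_set_access_flags(vma, addr, pte, entry, 1))
		update_mmu_cache(vma, addr, pte);
	return 0;
}
```

insert_page_into_pte_locked() would then call it with page_to_pfn(page), and insert_pfn() with its own pfn once pfn_t is removed.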
On 10.01.25 07:00, Alistair Popple wrote:
> In preparation for using insert_page() for DAX, enhance
> insert_page_into_pte_locked() to handle establishing writable
> mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
> PTE which bypasses the typical set_pte_range() in finish_fault.
>
> Signed-off-by: Alistair Popple <apopple@nvidia.com>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
>
> ---
>
> Changes for v5:
> - Minor comment/formatting fixes suggested by David Hildenbrand
>
> Changes since v2:
>
> - New patch split out from "mm/memory: Add dax_insert_pfn"
> ---
>  mm/memory.c | 37 +++++++++++++++++++++++++++++--------
>  1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 06bb29e..8531acb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2126,19 +2126,40 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
>  }
>
>  static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
> -			unsigned long addr, struct page *page, pgprot_t prot)
> +			unsigned long addr, struct page *page,
> +			pgprot_t prot, bool mkwrite)
>  {
>  	struct folio *folio = page_folio(page);
> +	pte_t entry = ptep_get(pte);
>  	pte_t pteval;
>

Just drop "entry" and reuse "pteval"; even saves you from one bug below :)

	pte_t pteval = ptep_get(pte);

> -	if (!pte_none(ptep_get(pte)))
> -		return -EBUSY;
> +	if (!pte_none(entry)) {
> +		if (!mkwrite)
> +			return -EBUSY;
> +
> +		/* see insert_pfn(). */
> +		if (pte_pfn(entry) != page_to_pfn(page)) {
> +			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
> +			return -EFAULT;
> +		}
> +		entry = maybe_mkwrite(entry, vma);
> +		entry = pte_mkyoung(entry);
> +		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
> +			update_mmu_cache(vma, addr, pte);
> +		return 0;
> +	}
> +
>  	/* Ok, finally just insert the thing.. */
>  	pteval = mk_pte(page, prot);
>  	if (unlikely(is_zero_folio(folio))) {
>  		pteval = pte_mkspecial(pteval);
>  	} else {
>  		folio_get(folio);
> +		entry = mk_pte(page, prot);

we already do "pteval = mk_pte(page, prot);" above? And I think your
change here does not do what you want, because you modify the new
"entry" but we do

	set_pte_at(vma->vm_mm, addr, pte, pteval);

below ...

> +		if (mkwrite) {
> +			entry = pte_mkyoung(entry);
> +			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> +		}

So again, better just reuse pteval :)
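Taken together, David's two comments amount to the following shape for the function. This is a sketch of the suggested end state, not the posted v5 patch; the tail (set_pte_at() and the final return) is carried over from the unmodified context of the diff.

```c
static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
			unsigned long addr, struct page *page,
			pgprot_t prot, bool mkwrite)
{
	struct folio *folio = page_folio(page);
	/* Read the current PTE once; reused for both paths below. */
	pte_t pteval = ptep_get(pte);

	if (!pte_none(pteval)) {
		if (!mkwrite)
			return -EBUSY;

		/* See insert_pfn(): only upgrade a mapping of the same pfn. */
		if (pte_pfn(pteval) != page_to_pfn(page)) {
			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(pteval)));
			return -EFAULT;
		}
		pteval = maybe_mkwrite(pteval, vma);
		pteval = pte_mkyoung(pteval);
		if (ptep_set_access_flags(vma, addr, pte, pteval, 1))
			update_mmu_cache(vma, addr, pte);
		return 0;
	}

	/* Ok, finally just insert the thing.. */
	pteval = mk_pte(page, prot);
	if (unlikely(is_zero_folio(folio))) {
		pteval = pte_mkspecial(pteval);
	} else {
		folio_get(folio);
		/*
		 * Setting the write bits on "pteval" (not a separate
		 * "entry") is what fixes the bug David points out:
		 * set_pte_at() below stores "pteval".
		 */
		if (mkwrite) {
			pteval = pte_mkyoung(pteval);
			pteval = maybe_mkwrite(pte_mkdirty(pteval), vma);
		}
		inc_mm_counter(vma->vm_mm, mm_counter_file(folio));
		folio_add_file_rmap_pte(folio, page, vma);
	}
	set_pte_at(vma->vm_mm, addr, pte, pteval);
	return 0;
}
```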
On Mon, Jan 13, 2025 at 05:08:31PM -0800, Dan Williams wrote:
> Alistair Popple wrote:
> > In preparation for using insert_page() for DAX, enhance
> > insert_page_into_pte_locked() to handle establishing writable
> > mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
> > PTE which bypasses the typical set_pte_range() in finish_fault.
> >
> > Signed-off-by: Alistair Popple <apopple@nvidia.com>
> > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> >
> > ---
> >
> > Changes for v5:
> > - Minor comment/formatting fixes suggested by David Hildenbrand
> >
> > Changes since v2:
> >
> > - New patch split out from "mm/memory: Add dax_insert_pfn"
> > ---
> >  mm/memory.c | 37 +++++++++++++++++++++++++++++--------
> >  1 file changed, 29 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 06bb29e..8531acb 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2126,19 +2126,40 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
> >  }
> >
> >  static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
> > -			unsigned long addr, struct page *page, pgprot_t prot)
> > +			unsigned long addr, struct page *page,
> > +			pgprot_t prot, bool mkwrite)
> >  {
> >  	struct folio *folio = page_folio(page);
> > +	pte_t entry = ptep_get(pte);
> >  	pte_t pteval;
> >
> > -	if (!pte_none(ptep_get(pte)))
> > -		return -EBUSY;
> > +	if (!pte_none(entry)) {
> > +		if (!mkwrite)
> > +			return -EBUSY;
> > +
> > +		/* see insert_pfn(). */
> > +		if (pte_pfn(entry) != page_to_pfn(page)) {
> > +			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
> > +			return -EFAULT;
> > +		}
> > +		entry = maybe_mkwrite(entry, vma);
> > +		entry = pte_mkyoung(entry);
> > +		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
> > +			update_mmu_cache(vma, addr, pte);
> > +		return 0;
> > +	}
>
> This hunk feels like it is begging to be unified with insert_pfn() after
> pfn_t dies. Perhaps a TODO to remember to come back and unify them, or
> you can go append that work to your pfn_t removal series?

No one has complained about removing pfn_t so I do intend to clean that
series up once this has all been merged somewhere, so I will just go
append this work there.

> Other than that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
```diff
diff --git a/mm/memory.c b/mm/memory.c
index 06bb29e..8531acb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2126,19 +2126,40 @@ static int validate_page_before_insert(struct vm_area_struct *vma,
 }
 
 static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
-			unsigned long addr, struct page *page, pgprot_t prot)
+			unsigned long addr, struct page *page,
+			pgprot_t prot, bool mkwrite)
 {
 	struct folio *folio = page_folio(page);
+	pte_t entry = ptep_get(pte);
 	pte_t pteval;
 
-	if (!pte_none(ptep_get(pte)))
-		return -EBUSY;
+	if (!pte_none(entry)) {
+		if (!mkwrite)
+			return -EBUSY;
+
+		/* see insert_pfn(). */
+		if (pte_pfn(entry) != page_to_pfn(page)) {
+			WARN_ON_ONCE(!is_zero_pfn(pte_pfn(entry)));
+			return -EFAULT;
+		}
+		entry = maybe_mkwrite(entry, vma);
+		entry = pte_mkyoung(entry);
+		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
+			update_mmu_cache(vma, addr, pte);
+		return 0;
+	}
+
 	/* Ok, finally just insert the thing.. */
 	pteval = mk_pte(page, prot);
 	if (unlikely(is_zero_folio(folio))) {
 		pteval = pte_mkspecial(pteval);
 	} else {
 		folio_get(folio);
+		entry = mk_pte(page, prot);
+		if (mkwrite) {
+			entry = pte_mkyoung(entry);
+			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+		}
 		inc_mm_counter(vma->vm_mm, mm_counter_file(folio));
 		folio_add_file_rmap_pte(folio, page, vma);
 	}
@@ -2147,7 +2168,7 @@ static int insert_page_into_pte_locked(struct vm_area_struct *vma, pte_t *pte,
 }
 
 static int insert_page(struct vm_area_struct *vma, unsigned long addr,
-			struct page *page, pgprot_t prot)
+			struct page *page, pgprot_t prot, bool mkwrite)
 {
 	int retval;
 	pte_t *pte;
@@ -2160,7 +2181,7 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
 	pte = get_locked_pte(vma->vm_mm, addr, &ptl);
 	if (!pte)
 		goto out;
-	retval = insert_page_into_pte_locked(vma, pte, addr, page, prot);
+	retval = insert_page_into_pte_locked(vma, pte, addr, page, prot, mkwrite);
 	pte_unmap_unlock(pte, ptl);
 out:
 	return retval;
@@ -2174,7 +2195,7 @@ static int insert_page_in_batch_locked(struct vm_area_struct *vma, pte_t *pte,
 	err = validate_page_before_insert(vma, page);
 	if (err)
 		return err;
-	return insert_page_into_pte_locked(vma, pte, addr, page, prot);
+	return insert_page_into_pte_locked(vma, pte, addr, page, prot, false);
 }
 
 /* insert_pages() amortizes the cost of spinlock operations
@@ -2310,7 +2331,7 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 		BUG_ON(vma->vm_flags & VM_PFNMAP);
 		vm_flags_set(vma, VM_MIXEDMAP);
 	}
-	return insert_page(vma, addr, page, vma->vm_page_prot);
+	return insert_page(vma, addr, page, vma->vm_page_prot, false);
 }
 EXPORT_SYMBOL(vm_insert_page);
 
@@ -2590,7 +2611,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
 		 * result in pfn_t_has_page() == false.
 		 */
 		page = pfn_to_page(pfn_t_to_pfn(pfn));
-		err = insert_page(vma, addr, page, pgprot);
+		err = insert_page(vma, addr, page, pgprot, mkwrite);
 	} else {
 		return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
 	}
```
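Note how the new mkwrite argument threads through the callers: insert_page_in_batch_locked() and vm_insert_page() pass false, so their behaviour is unchanged, and only __vm_insert_mixed() forwards its mkwrite flag. A sketch of the resulting DAX write-fault path follows; the handler below is illustrative only, not code from this series.

```c
/*
 * Illustrative only: how a DAX write fault reaches the new code.
 * vmf_insert_mixed_mkwrite() calls __vm_insert_mixed() with
 * mkwrite = true, which with this patch reaches
 * insert_page(..., mkwrite = true) whenever the pfn has a
 * struct page.
 */
static vm_fault_t example_dax_write_fault(struct vm_fault *vmf, pfn_t pfn)
{
	/*
	 * The PTE installed here is already young, dirty and writable,
	 * so returning VM_FAULT_NOPAGE (which skips set_pte_range() in
	 * finish_fault()) leaves nothing for the core fault path to do.
	 */
	return vmf_insert_mixed_mkwrite(vmf->vma, vmf->address, pfn);
}
```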
In preparation for using insert_page() for DAX, enhance
insert_page_into_pte_locked() to handle establishing writable
mappings. Recall that DAX returns VM_FAULT_NOPAGE after installing a
PTE which bypasses the typical set_pte_range() in finish_fault.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>

---

Changes for v5:
- Minor comment/formatting fixes suggested by David Hildenbrand

Changes since v2:

- New patch split out from "mm/memory: Add dax_insert_pfn"
---
 mm/memory.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)
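For context on the commit message's last point: on a normal write fault, finish_fault() reaches set_pte_range(), which is what marks the PTE young, dirty and writable. Paraphrased and trimmed from set_pte_range() in mm/memory.c of the same era (not part of this patch):

```c
	entry = mk_pte(page, vma->vm_page_prot);
	entry = pte_sw_mkyoung(entry);
	if (write)
		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
```

Because a DAX fault returns VM_FAULT_NOPAGE after installing the PTE itself, that code never runs, so insert_page_into_pte_locked() has to apply the equivalent bits when mkwrite is set.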