Message ID | d7a6c9822ddc945daaf4f9db34d3f2b1c0454447.1736488799.git-series.apopple@nvidia.com |
---|---|
State | New |
Headers | show |
Series | fs/dax: Fix ZONE_DEVICE page reference counts | expand |
Alistair Popple wrote: > The procfs mmu files such as smaps and pagemap currently ignore devdax and > fsdax pages because these pages are considered special. A future change > will start treating these as normal pages, meaning they can be exposed via > smaps and pagemap. > > The only difference is that devdax and fsdax pages can never be pinned for > DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to > reflect that. I don't understand this patch. pin_user_pages() is also used for Direct-I/O page pinning, so the comment about FOLL_LONGTERM is wrong, and I otherwise do not understand what goes wrong if the only pte_is_pinned() user correctly detects the pin state?
On 14.01.25 03:28, Dan Williams wrote: > Alistair Popple wrote: >> The procfs mmu files such as smaps and pagemap currently ignore devdax and >> fsdax pages because these pages are considered special. A future change >> will start treating these as normal pages, meaning they can be exposed via >> smaps and pagemap. >> >> The only difference is that devdax and fsdax pages can never be pinned for >> DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to >> reflect that. > > I don't understand this patch. This whole pte_is_pinned() should likely be ripped out (and I have a patch here to do that for a long time). But that's a different discussion. > > pin_user_pages() is also used for Direct-I/O page pinning, so the > comment about FOLL_LONGTERM is wrong, and I otherwise do not understand > what goes wrong if the only pte_is_pinned() user correctly detects the > pin state? Yes, this patch should likely just be dropped. Even if folio_maybe_dma_pinned() == true because of "false positives", it will behave just like other order-0 pages with false positives, and only affect soft-dirty tracking ... which nobody should be caring about here at all. We would always detect the PTE as soft-dirty because we we never pte_wrprotect(old_pte) Yes, nobody should care.
On Tue, Jan 14, 2025 at 05:45:46PM +0100, David Hildenbrand wrote: > On 14.01.25 03:28, Dan Williams wrote: > > Alistair Popple wrote: > > > The procfs mmu files such as smaps and pagemap currently ignore devdax and > > > fsdax pages because these pages are considered special. A future change > > > will start treating these as normal pages, meaning they can be exposed via > > > smaps and pagemap. > > > > > > The only difference is that devdax and fsdax pages can never be pinned for > > > DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to > > > reflect that. > > > > I don't understand this patch. > > > This whole pte_is_pinned() should likely be ripped out (and I have a patch > here to do that for a long time). Agreed. > But that's a different discussion. > > > > > pin_user_pages() is also used for Direct-I/O page pinning, so the > > comment about FOLL_LONGTERM is wrong, and I otherwise do not understand > > what goes wrong if the only pte_is_pinned() user correctly detects the > > pin state? > > Yes, this patch should likely just be dropped. Yeah, I think I was just being overly cautious about the change to vm_normal_page(). Agree this can be dropped. Looking at task_mmu.c there is one other user of vm_normal_page() - clear_refs_pte_range(). We will start clearing access/referenced bits on DAX PTEs there. But I think that's actually the right thing to do given we do currently clear them for PMD mapped DAX pages. > Even if folio_maybe_dma_pinned() == true because of "false positives", it > will behave just like other order-0 pages with false positives, and only > affect soft-dirty tracking ... which nobody should be caring about here at > all. > > We would always detect the PTE as soft-dirty because we we never > pte_wrprotect(old_pte) > > Yes, nobody should care. > > -- > Cheers, > > David / dhildenb >
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 38a5a3e..9a8a7d3 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1378,7 +1378,7 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags))) return false; folio = vm_normal_folio(vma, addr, pte); - if (!folio) + if (!folio || folio_is_devdax(folio) || folio_is_fsdax(folio)) return false; return folio_maybe_dma_pinned(folio); }
The procfs mmu files such as smaps and pagemap currently ignore devdax and fsdax pages because these pages are considered special. A future change will start treating these as normal pages, meaning they can be exposed via smaps and pagemap. The only difference is that devdax and fsdax pages can never be pinned for DMA via FOLL_LONGTERM, so add an explicit check in pte_is_pinned() to reflect that. Signed-off-by: Alistair Popple <apopple@nvidia.com> --- Changes for v5: - After discussion with David remove the checks for DAX pages for smaps and pagemap walkers. This means DAX pages will now appear in those procfs files. --- fs/proc/task_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)