
[v1,11/12] mm/rmap: handle device-exclusive entries correctly in page_vma_mkclean_one()

Message ID 20250129115411.2077152-12-david@redhat.com
State New
Series mm: fixes for device-exclusive entries (hmm)

Commit Message

David Hildenbrand Jan. 29, 2025, 11:54 a.m. UTC
Ever since commit b756a3b5e7ea ("mm: device exclusive memory access")
we can return with a device-exclusive entry from page_vma_mapped_walk().

page_vma_mkclean_one() is not prepared for that, so teach it about these
non-present, non-swap PTEs.
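
As a minimal sketch (not part of the patch; the helper name is made up
for illustration), such an entry can be told apart from an ordinary
swap entry using the existing helpers from <linux/swapops.h>:

	/* Hypothetical helper, for illustration only. */
	static bool pte_is_device_exclusive(pte_t entry)
	{
		/* Non-present and not pte_none(): some swap-format entry. */
		if (!is_swap_pte(entry))
			return false;
		return is_device_exclusive_entry(pte_to_swp_entry(entry));
	}

page_vma_mapped_walk()->check_pte() matches such entries because they
still map a page, even though pte_present() is false.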

We'll likely never hit that path with device-private entries, but we
could with device-exclusive ones.

It's not really clear what to do here: the device could be modifying the
page, but a device-exclusive entry carries no writable or dirty bits we
could inspect. Likely MMU notifiers should be taking care of that, and
we can simply assume "not writable and not dirty from a CPU
perspective".
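
For contrast, this is roughly what page_vma_mkclean_one() does with a
present PTE that is writable or dirty; none of these steps is applicable
to a non-present entry:

	flush_cache_page(vma, address, pte_pfn(entry));
	entry = ptep_clear_flush(vma, address, pte);
	entry = pte_wrprotect(entry);
	entry = pte_mkclean(entry);
	set_pte_at(vma->vm_mm, address, pte, entry);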

Note that we can currently only run into this case with device-exclusive
entries on THPs: we still adjust the mapcount on conversion to
device-exclusive, so for order-0 folios the rmap walk aborts early
(folio_mapcount() == 0) and never reaches this code. We'll fix that
mapcount handling next, now that page_vma_mkclean_one() can handle
device-exclusive entries.
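
The early abort mentioned above happens in folio_mkclean(), roughly
(simplified, details elided):

	int folio_mkclean(struct folio *folio)
	{
		...
		/* Nothing mapped (folio_mapcount() == 0): nothing to clean. */
		if (!folio_mapped(folio))
			return 0;
		...
	}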

Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/rmap.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Patch

diff --git a/mm/rmap.c b/mm/rmap.c
index 77b063e9aec4..9e2002d97d6f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1050,6 +1050,14 @@ static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 			pte_t *pte = pvmw->pte;
 			pte_t entry = ptep_get(pte);
 
+			/*
+			 * We can end up here with selected non-swap entries
+			 * that actually map pages similar to PROT_NONE; see
+			 * page_vma_mapped_walk()->check_pte(). From a CPU
+			 * perspective, these PTEs are clean and not writable.
+			 */
+			if (!pte_present(entry))
+				continue;
 			if (!pte_dirty(entry) && !pte_write(entry))
 				continue;