Message ID | 20250120013038.6657-1-ioworker0@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [RFC,1/1] mm/madvise: fail MADV_PAGEOUT on VM_DROPPABLE VMA |
On Mon, Jan 20, 2025 at 2:31 PM Lance Yang <ioworker0@gmail.com> wrote:
>
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.
>
> So, let's add a check to detect the VM_DROPPABLE flag before doing
> MADV_PAGEOUT and returns -EINVAL.
>
> Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
> Signed-off-by: Lance Yang <ioworker0@gmail.com>

I am not convinced this is a correct patch. try_to_unmap_one() will
drop the folio. madvise_pageout(vma, prev, start, end) will still go
to shrink_folio_list() and try_to_unmap_one().

> ---
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 49f3a75046f6..29d0234da8a1 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
>  	case MADV_COLD:
>  		return madvise_cold(vma, prev, start, end);
>  	case MADV_PAGEOUT:
> +		if (vma->vm_flags & VM_DROPPABLE)
> +			return -EINVAL;
>  		return madvise_pageout(vma, prev, start, end);
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
> --
> 2.45.2
>

Thanks
Barry
On 20.01.25 02:30, Lance Yang wrote:
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.

It behaves exactly correct I think. They can be dropped any time, which
includes on misplaced MADV_PAGEOUT.
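The behaviour under discussion can be probed from userspace with a minimal sketch along the following lines. It is not part of the patch or of any reply. The helper name `probe_droppable_pageout()` and the `PROBE_*` result codes are hypothetical, and the fallback numeric values for `MAP_DROPPABLE` and `MADV_PAGEOUT` are assumptions taken from the generic uapi headers for the case where the libc headers do not define them. The sketch maps one droppable anonymous page, fills it, issues `MADV_PAGEOUT`, and reports whether the contents survived.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers; assumed generic uapi values. */
#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Hypothetical result codes, used only by this sketch. */
enum probe_result {
	PROBE_UNSUPPORTED = -1,	/* mmap() or madvise() refused the request */
	PROBE_KEPT = 0,		/* contents intact after MADV_PAGEOUT */
	PROBE_DROPPED = 1,	/* page no longer holds what was written */
};

static int probe_droppable_pageout(void)
{
	long page = sysconf(_SC_PAGESIZE);
	assert(page > 0);
	size_t len = (size_t)page;

	/* MAP_DROPPABLE is a mapping type, used in place of MAP_PRIVATE. */
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_ANONYMOUS | MAP_DROPPABLE, -1, 0);
	if (p == MAP_FAILED)
		return PROBE_UNSUPPORTED;

	/* Fault the page in and give it recognisable contents. */
	memset(p, 0xAA, len);

	int result;
	if (madvise(p, len, MADV_PAGEOUT) != 0) {
		/* With the proposed check this would be the EINVAL path. */
		result = PROBE_UNSUPPORTED;
	} else {
		/* Read through a volatile pointer so the load is not elided. */
		volatile unsigned char *vp = p;
		result = (vp[0] == 0xAA) ? PROBE_KEPT : PROBE_DROPPED;
	}

	munmap(p, len);
	return result;
}
```

The result is only indicative. A droppable page may be discarded at any time under memory pressure, and `MADV_PAGEOUT` is a hint that the kernel may not act on for every page, so `PROBE_KEPT` does not prove that the page can never be dropped. On kernels or configurations without `MAP_DROPPABLE` support the helper returns `PROBE_UNSUPPORTED`.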
Sorry but NACK again on this :(

On Mon, Jan 20, 2025 at 09:30:38AM +0800, Lance Yang wrote:
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.
>
> So, let's add a check to detect the VM_DROPPABLE flag before doing
> MADV_PAGEOUT and returns -EINVAL.

No, let's not.

Firstly this behaviour, whether you feel it is right or not, is now
_established userland behaviour_. A pedantic interpretation of how
MADV_PAGEOUT ought to interact with MAP_DROPPABLE doesn't trump what is
already in released kernel versions, unless it is wrong in a -broken-
fashion.

Also I'd say you'd 100% expect an anon page to be dropped in the case of
MAP_DROPPABLE, the semantics of which are 'if there is a request to drop
this, drop it'. Here, the user is saying 'hey drop this'. So we drop it :)

I think the more correct patch would be to extend the man page to
explicitly mention MAP_DROPPABLE.

>
> Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
> Signed-off-by: Lance Yang <ioworker0@gmail.com>
> ---
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 49f3a75046f6..29d0234da8a1 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
>  	case MADV_COLD:
>  		return madvise_cold(vma, prev, start, end);
>  	case MADV_PAGEOUT:
> +		if (vma->vm_flags & VM_DROPPABLE)
> +			return -EINVAL;
>  		return madvise_pageout(vma, prev, start, end);
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
> --
> 2.45.2
>
diff --git a/mm/madvise.c b/mm/madvise.c
index 49f3a75046f6..29d0234da8a1 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
 	case MADV_COLD:
 		return madvise_cold(vma, prev, start, end);
 	case MADV_PAGEOUT:
+		if (vma->vm_flags & VM_DROPPABLE)
+			return -EINVAL;
 		return madvise_pageout(vma, prev, start, end);
 	case MADV_FREE:
 	case MADV_DONTNEED: