Message ID | 20250120013038.6657-1-ioworker0@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [RFC,1/1] mm/madvise: fail MADV_PAGEOUT on VM_DROPPABLE VMA |
On Mon, Jan 20, 2025 at 2:31 PM Lance Yang <ioworker0@gmail.com> wrote:
>
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.
>
> So, let's add a check to detect the VM_DROPPABLE flag before doing
> MADV_PAGEOUT and returns -EINVAL.
>
> Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
> Signed-off-by: Lance Yang <ioworker0@gmail.com>

I am not convinced this is a correct patch. try_to_unmap_one() will drop
the folio. madvise_pageout(vma, prev, start, end) will still go to
shrink_folio_list() and try_to_unmap_one().

> ---
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 49f3a75046f6..29d0234da8a1 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
>  	case MADV_COLD:
>  		return madvise_cold(vma, prev, start, end);
>  	case MADV_PAGEOUT:
> +		if (vma->vm_flags & VM_DROPPABLE)
> +			return -EINVAL;
>  		return madvise_pageout(vma, prev, start, end);
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
> --
> 2.45.2
>

Thanks
Barry
On 20.01.25 02:30, Lance Yang wrote:
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.

It behaves exactly correct I think. They can be dropped any time, which
includes on misplaced MADV_PAGEOUT.
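For readers who want to observe the behaviour being discussed, the following is a minimal userspace sketch, not part of the posted series. It maps a MAP_DROPPABLE region, dirties it, calls MADV_PAGEOUT, and checks whether the contents survived. The helper probe_pageout_on_droppable() and the probe_result codes are hypothetical names introduced here for illustration. The fallback values for MAP_DROPPABLE and MADV_PAGEOUT are assumptions taken from the Linux uapi headers, for the case where the installed libc headers do not define them.

```c
/* Userspace probe: what does MADV_PAGEOUT do to a MAP_DROPPABLE mapping? */
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; values assumed from the Linux uapi headers. */
#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Hypothetical result codes, introduced only for this sketch. */
enum probe_result {
	PROBE_UNSUPPORTED = -1,		/* mmap() rejected MAP_DROPPABLE */
	PROBE_KEPT = 0,			/* contents still intact after madvise() */
	PROBE_DROPPED = 1,		/* at least one byte no longer holds the pattern */
	PROBE_MADVISE_FAILED = 2,	/* madvise() returned an error */
};

static enum probe_result probe_pageout_on_droppable(size_t len)
{
	unsigned char *p;
	enum probe_result res = PROBE_KEPT;
	size_t i;

	assert(len > 0);

	/* MAP_DROPPABLE is a mapping type, used in place of MAP_PRIVATE/MAP_SHARED. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_ANONYMOUS | MAP_DROPPABLE, -1, 0);
	if (p == MAP_FAILED)
		return PROBE_UNSUPPORTED;

	/* Dirty every page with a non-zero pattern. */
	memset(p, 0xAA, len);

	if (madvise(p, len, MADV_PAGEOUT) != 0) {
		munmap(p, len);
		return PROBE_MADVISE_FAILED;
	}

	/* A dropped page faults back in as zeroes, so the pattern is gone. */
	for (i = 0; i < len; i++) {
		if (p[i] != 0xAA) {
			res = PROBE_DROPPED;
			break;
		}
	}

	munmap(p, len);
	return res;
}
```

Calling probe_pageout_on_droppable(16 * 4096) from a small main() is enough to try it. PROBE_DROPPED is consistent with the behaviour described in this thread, but it is not proof on its own, since droppable pages may also be discarded under ordinary memory pressure, and MADV_PAGEOUT is only advice, so PROBE_KEPT is possible too. With the RFC applied, the madvise() call would presumably fail with EINVAL and the probe would report PROBE_MADVISE_FAILED.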
Sorry but NACK again on this :(

On Mon, Jan 20, 2025 at 09:30:38AM +0800, Lance Yang wrote:
> MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> should not be swapped out.
>
> There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> could break the semantics of MADV_PAGEOUT, IMO.
>
> So, let's add a check to detect the VM_DROPPABLE flag before doing
> MADV_PAGEOUT and returns -EINVAL.

No, let's not.

Firstly this behaviour, whether you feel it is right or not, is now
_established userland behaviour_. A pedantic interpretation of how
MADV_PAGEOUT ought to interact with MAP_DROPPABLE doesn't trump what is
already in released kernel versions, unless it is wrong in a -broken-
fashion.

Also I'd say you'd 100% expect an anon page to be dropped in the case of
MAP_DROPPABLE, the semantics of which are 'if there is a request to drop
this, drop it'. Here, the user is saying 'hey drop this'. So we drop it :)

I think the more correct patch would be to extend the man page to
explicitly mention MAP_DROPPABLE.

>
> Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
> Signed-off-by: Lance Yang <ioworker0@gmail.com>
> ---
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 49f3a75046f6..29d0234da8a1 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
>  	case MADV_COLD:
>  		return madvise_cold(vma, prev, start, end);
>  	case MADV_PAGEOUT:
> +		if (vma->vm_flags & VM_DROPPABLE)
> +			return -EINVAL;
>  		return madvise_pageout(vma, prev, start, end);
>  	case MADV_FREE:
>  	case MADV_DONTNEED:
> --
> 2.45.2
>
Hi Barry, David and Lorenzo,

Thanks a lot for taking time to review!

On Mon, Jan 20, 2025 at 6:45 PM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> Sorry but NACK again on this :(
>
> On Mon, Jan 20, 2025 at 09:30:38AM +0800, Lance Yang wrote:
> > MADV_PAGEOUT should fail on VMAs with the VM_DROPPABLE flag. While
> > MADV_PAGEOUT is intended to move anonymous pages to swap, VM_DROPPABLE
> > should not be swapped out.
> >
> > There is an issue where using MADV_PAGEOUT on a VMA with the VM_DROPPABLE
> > flag behaves like MADV_DONTNEED, causing the pages to be dropped. This
> > could break the semantics of MADV_PAGEOUT, IMO.
> >
> > So, let's add a check to detect the VM_DROPPABLE flag before doing
> > MADV_PAGEOUT and returns -EINVAL.
>
> No, let's not.

I think I completely got it wrong. Learning so much from your patient
responses!

>
> Firstly this behaviour, whether you feel it is right or not, is now
> _established userland behaviour_. A pedantic interpretation of how
> MADV_PAGEOUT ought to interact with MAP_DROPPABLE doesn't trump what is
> already in released kernel versions, unless it is wrong in a -broken-
> fashion.

Yeah, this patch would break the established userland behaviour as well.

>
> Also I'd say you'd 100% expect an anon page to be dropped in the case of
> MAP_DROPPABLE, the semantics of which are 'if there is a request to drop
> this, drop it'. Here, the user is saying 'hey drop this'. So we drop it :)

Yep, you're right.

>
> I think the more correct patch would be to extend the man page to
> explicitly mention MAP_DROPPABLE.

Thanks again for your time!
Lance

>
> >
> > Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> > Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
> > Signed-off-by: Lance Yang <ioworker0@gmail.com>
> > ---
> >  mm/madvise.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/madvise.c b/mm/madvise.c
> > index 49f3a75046f6..29d0234da8a1 100644
> > --- a/mm/madvise.c
> > +++ b/mm/madvise.c
> > @@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
> >  	case MADV_COLD:
> >  		return madvise_cold(vma, prev, start, end);
> >  	case MADV_PAGEOUT:
> > +		if (vma->vm_flags & VM_DROPPABLE)
> > +			return -EINVAL;
> >  		return madvise_pageout(vma, prev, start, end);
> >  	case MADV_FREE:
> >  	case MADV_DONTNEED:
> > --
> > 2.45.2
> >
diff --git a/mm/madvise.c b/mm/madvise.c
index 49f3a75046f6..29d0234da8a1 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1263,6 +1263,8 @@ static int madvise_vma_behavior(struct vm_area_struct *vma,
 	case MADV_COLD:
 		return madvise_cold(vma, prev, start, end);
 	case MADV_PAGEOUT:
+		if (vma->vm_flags & VM_DROPPABLE)
+			return -EINVAL;
 		return madvise_pageout(vma, prev, start, end);
 	case MADV_FREE:
 	case MADV_DONTNEED: