Message ID | 20210726154932.102880-1-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | [v1] mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ\|WRITE) |
Hi Andrew,

sorry for not CCing you, absolutely no clue why I accidentally dropped you.

Can you give this patch a churn? It would be great if we could get that into 5.14, so we don't have to deal with differing behavior between Linux versions.

Cheers!

On 26.07.21 17:49, David Hildenbrand wrote:
> [...]
diff --git a/mm/gup.c b/mm/gup.c
index 42b8b1fa6521..b94717977d17 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1558,9 +1558,12 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start,
 		gup_flags |= FOLL_WRITE;
 
 	/*
-	 * See check_vma_flags(): Will return -EFAULT on incompatible mappings
-	 * or with insufficient permissions.
+	 * We want to report -EINVAL instead of -EFAULT for any permission
+	 * problems or incompatible mappings.
 	 */
+	if (check_vma_flags(vma, gup_flags))
+		return -EINVAL;
+
 	return __get_user_pages(mm, start, nr_pages, gup_flags,
 				NULL, NULL, locked);
 }
diff --git a/mm/madvise.c b/mm/madvise.c
index 6d3d348b17f4..5c065bc8b5f6 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -862,10 +862,12 @@ static long madvise_populate(struct vm_area_struct *vma,
 			switch (pages) {
 			case -EINTR:
 				return -EINTR;
-			case -EFAULT: /* Incompatible mappings / permissions. */
+			case -EINVAL: /* Incompatible mappings / permissions. */
 				return -EINVAL;
 			case -EHWPOISON:
 				return -EHWPOISON;
+			case -EFAULT: /* VM_FAULT_SIGBUS or VM_FAULT_SIGSEGV */
+				return -EFAULT;
 			default:
 				pr_warn_once("%s: unhandled return value: %ld\n",
 					     __func__, pages);
Doing some extended tests and polishing the man page update for
MADV_POPULATE_(READ|WRITE), I realized that we also end up converting
SIGBUS (via -EFAULT) to -EINVAL, making it look like yet another
madvise() user error.

We want to report only problematic mappings and permission problems that
the user could have known about as -EINVAL.

Let's not convert -EFAULT arising due to SIGBUS (or SIGSEGV) to
-EINVAL, but instead indicate -EFAULT to user space. While we could also
convert it to -ENOMEM, using -EFAULT looks more helpful when user space
might want to troubleshoot what's going wrong.

MADV_POPULATE_(READ|WRITE) is not yet part of a final Linux release, so
we can still adjust the behavior.

Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/gup.c     | 7 +++++--
 mm/madvise.c | 4 +++-
 2 files changed, 8 insertions(+), 3 deletions(-)
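
From the user-space side, the errno values described in the commit message could be told apart roughly as follows. This is only a minimal sketch and not part of the patch: `populate_errno_hint()` and `try_populate()` are hypothetical helper names, and the fallback `MADV_POPULATE_*` definitions assume the values from the Linux 5.14 UAPI headers for systems whose libc headers do not provide them yet.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: fallback values as in the Linux 5.14 UAPI headers. */
#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22
#endif
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: describe an errno value returned for
 * MADV_POPULATE_(READ|WRITE), following the semantics of this patch.
 */
static const char *populate_errno_hint(int err)
{
	switch (err) {
	case 0:
		return "range populated";
	case EINVAL:
		/* Something the caller could have known about up front. */
		return "incompatible mapping or insufficient permissions";
	case EFAULT:
		/* The fault itself failed while populating. */
		return "fault could not be resolved (SIGBUS or SIGSEGV condition)";
	case EINTR:
		return "interrupted by a signal";
#ifdef EHWPOISON
	case EHWPOISON:
		return "hardware-poisoned page in range";
#endif
	default:
		return "other error";
	}
}

/*
 * Hypothetical helper: try to prefault [addr, addr + len).
 * Returns 0 on success, otherwise the errno value set by madvise().
 */
static int try_populate(void *addr, size_t len, int for_write)
{
	int advice = for_write ? MADV_POPULATE_WRITE : MADV_POPULATE_READ;

	if (madvise(addr, len, advice) == 0)
		return 0;
	return errno;
}
```

A caller might log `populate_errno_hint(try_populate(addr, len, 0))` and treat `EINVAL` as a usage problem while handling `EFAULT` as a runtime condition of the mapping. Note that kernels without MADV_POPULATE_(READ|WRITE) support also report `EINVAL` for the unknown advice value, so the sketch cannot distinguish that case.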