[0/3] hugetlbfs: use i_mmap_rwsem for better synchronization

Message ID 20181203200850.6460-1-mike.kravetz@oracle.com (mailing list archive)

Message

Mike Kravetz Dec. 3, 2018, 8:08 p.m. UTC
These patches are a follow-up to the RFC,
http://lkml.kernel.org/r/20181024045053.1467-1-mike.kravetz@oracle.com
Comments made by Naoya were addressed.

There are two primary issues addressed here:
1) For shared pmds, huge PTE pointers returned by huge_pte_alloc can become
   invalid via a call to huge_pmd_unshare by another thread.
2) hugetlbfs page faults can race with truncation causing invalid global
   reserve counts and state.
Both issues are addressed by expanding the use of i_mmap_rwsem.

These issues have existed for a long time.  They can be recreated with a
test program that causes page fault/truncation races.  For simple mappings,
this results in a negative HugePages_Rsvd count.  If racing with mappings
that contain shared pmds, we can hit "BUG at fs/hugetlbfs/inode.c:444!" or
Oops! as the result of an invalid memory reference.

I broke up the larger RFC into separate patches addressing each issue.
Hopefully, this is easier to understand/review.

Mike Kravetz (3):
  hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
  hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
  hugetlbfs: remove unnecessary code after i_mmap_rwsem synchronization

 fs/hugetlbfs/inode.c | 50 +++++++++----------------
 mm/hugetlb.c         | 87 +++++++++++++++++++++++++++++++-------------
 mm/memory-failure.c  | 14 ++++++-
 mm/migrate.c         | 13 ++++++-
 mm/rmap.c            |  3 ++
 mm/userfaultfd.c     | 11 +++++-
 6 files changed, 116 insertions(+), 62 deletions(-)

Comments

Andrew Morton Dec. 14, 2018, 9:22 p.m. UTC | #1
On Mon,  3 Dec 2018 12:08:47 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:

> These patches are a follow up to the RFC,
> http://lkml.kernel.org/r/20181024045053.1467-1-mike.kravetz@oracle.com
> Comments made by Naoya were addressed.
> 
> There are two primary issues addressed here:
> 1) For shared pmds, huge PTE pointers returned by huge_pte_alloc can become
>    invalid via a call to huge_pmd_unshare by another thread.
> 2) hugetlbfs page faults can race with truncation causing invalid global
>    reserve counts and state.
> Both issues are addressed by expanding the use of i_mmap_rwsem.
> 
> These issues have existed for a long time.  They can be recreated with a
> test program that causes page fault/truncation races.  For simple mappings,
> this results in a negative HugePages_Rsvd count.  If racing with mappings
> that contain shared pmds, we can hit "BUG at fs/hugetlbfs/inode.c:444!" or
> Oops! as the result of an invalid memory reference.
> 
> I broke up the larger RFC into separate patches addressing each issue.
> Hopefully, this is easier to understand/review.

Three patches tagged for -stable and no reviewers yet.  Could people
please take a close look?