[0/4] mm/hugetlb: Early cow on fork, and a few cleanups

Message ID 20210203210832.113685-1-peterx@redhat.com (mailing list archive)

Message

Peter Xu Feb. 3, 2021, 9:08 p.m. UTC
As reported by Gal [1], we are still missing the code that handles early cow
for the hugetlb case.  To me it still feels odd to fork() after using a few
huge pages, especially if they're privately mapped.  However, I do agree with
Gal and Jason that we should still have it, since it at least completes the
early cow on fork effort, and it fixes issues where buffers are not well under
control and MADV_DONTFORK is not easy to apply.
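
For buffers that are under the application's control, MADV_DONTFORK can be
applied directly.  A minimal userspace sketch, assuming an anonymous private
mapping; the helper name alloc_dontfork_buffer() is made up for illustration
and is not part of this series:

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of this series): map an anonymous private
 * buffer and mark it MADV_DONTFORK so that a child created by fork() does
 * not inherit the range.  Returns NULL on failure.
 */
void *alloc_dontfork_buffer(size_t len)
{
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return NULL;

	/* Exclude the range from the child's address space on fork(). */
	if (madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);
		return NULL;
	}
	return buf;
}
```

This only helps when the caller owns the allocation; it does not cover
buffers allocated somewhere the application cannot easily reach.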

The first two patches (1-2) are cleanups I noticed while reading the hugetlb
reserve map code.  I think they're good to have, but they're not necessary
for fixing the fork issue.

The last two patches (3-4) are the real fix.

I tested this with a fork() after some vfio-pci assignment, so I'm pretty sure
the page copy path triggers correctly (the page is accounted right after the
fork()), but I didn't do a data check since the card I assigned is some random
NIC.  Gal, please feel free to try this if you have a better way to verify the
series.

  https://github.com/xzpeter/linux/tree/fork-cow-pin-huge

Please review, thanks!

[1] https://lore.kernel.org/lkml/27564187-4a08-f187-5a84-3df50009f6ca@amazon.com/

Peter Xu (4):
  hugetlb: Dedup the code to add a new file_region
  hugetlb: Break earlier in add_reservation_in_range() when we can
  mm: Introduce page_needs_cow_for_dma() for deciding whether cow
  hugetlb: Do early cow when page pinned on src mm

 include/linux/mm.h |  21 ++++++++
 mm/huge_memory.c   |   8 +--
 mm/hugetlb.c       | 129 ++++++++++++++++++++++++++++++++++-----------
 mm/internal.h      |   5 --
 mm/memory.c        |   7 +--
 5 files changed, 123 insertions(+), 47 deletions(-)
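
Roughly, the kind of decision that patches 3-4 revolve around can be modelled
as below.  This is a simplified userspace model, not the kernel code: the
struct and field names are invented for illustration, and the three
conditions are an assumption about what such a check looks at, not something
taken from the patches themselves.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state (not real types). */
struct model_mapping {
	bool is_cow_mapping;	/* private mapping that may be written */
	bool mm_has_pinned;	/* the source mm has pinned pages before */
};

struct model_page {
	bool maybe_dma_pinned;	/* page may currently be pinned for DMA */
};

/*
 * Model of the decision at fork() time: copy the page for the child right
 * away (early cow) only when the mapping is cow, the source mm has pinned
 * pages, and this page may be pinned.  Otherwise the usual lazy cow applies.
 */
bool model_needs_early_cow(const struct model_mapping *m,
			   const struct model_page *p)
{
	if (!m->is_cow_mapping)
		return false;
	if (!m->mm_has_pinned)
		return false;
	return p->maybe_dma_pinned;
}
```

The point of copying early is that the parent keeps the page that may be
pinned, while the child receives its own copy at fork() time.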

Comments

Gal Pressman Feb. 4, 2021, 2:32 p.m. UTC | #1
On 03/02/2021 23:08, Peter Xu wrote:
> As reported by Gal [1], we still miss the code clip to handle early cow for
> hugetlb case, which is true.  Again, it still feels odd to fork() after using a
> few huge pages, especially if they're privately mapped to me..  However I do
> agree with Gal and Jason in that we should still have that since that'll
> complete the early cow on fork effort at least, and it'll still fix issues
> where buffers are not well under control and not easy to apply MADV_DONTFORK.
>
> The first two patches (1-2) are some cleanups I noticed when reading into the
> hugetlb reserve map code.  I think it's good to have but they're not necessary
> for fixing the fork issue.
>
> The last two patches (3-4) is the real fix.
>
> I tested this with a fork() after some vfio-pci assignment, so I'm pretty sure
> the page copy path could trigger well (page will be accounted right after the
> fork()), but I didn't do data check since the card I assigned is some random
> nic.  Gal, please feel free to try this if you have better way to verify the
> series.

Thanks Peter, once v2 is submitted I'll pull the patches and we'll run the tests
that discovered the issue to verify it works.