mbox series

[v4,0/8] make hugetlb put_page safe for all calling contexts

Message ID 20210405230043.182734-1-mike.kravetz@oracle.com (mailing list archive)
Headers show
Series make hugetlb put_page safe for all calling contexts | expand

Message

Mike Kravetz April 5, 2021, 11 p.m. UTC
IMPORTANT NOTE FOR REVIEWERS:  v3 of this series is in Andrew's mmotm
tree v5.12-rc5-mmotm-2021-03-31-21-27.  Muchun Song noticed that Oscar
Salvador's series "Make alloc_contig_range handle Hugetlb pages" was also
added to that mmotm version.  v3 of this series did not take Oscar's
changes into account.  v4 addresses those omissions.

v4 is based on the following:
- Start with v5.12-rc5-mmotm-2021-03-31-21-27
- Revert v3 of this series

Patch changes from v3:
1	- Trivial context fixup due to cma changes.  No functional changes
2, 3	- No change
4, 5	- Changes required due to "Make alloc_contig_range handle
	  Hugetlb pages".  Specifically, alloc_and_dissolve_huge_page
	  changes are in need of review.
6	- No change
7	- Fairly straight forward spin_*lock calls to spin_*lock_irq*
	  changes in code from "Make alloc_contig_range handle Hugetlb pages"
	  series.
8	- Trivial change due to context changes.  No functional change.

If easier to review, I could also provide delta patches on top of
patches 4, 5, and 7 of v3.

Original cover letter follows:
This effort is the result a recent bug report [1].  Syzbot found a
potential deadlock in the hugetlb put_page/free_huge_page_path.
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
Since the free_huge_page_path already has code to 'hand off' page
free requests to a workqueue, a suggestion was proposed to make
the in_irq() detection accurate by always enabling PREEMPT_COUNT [2].
The outcome of that discussion was that the hugetlb put_page path
(free_huge_page) path should be properly fixed and safe for all calling
contexts.

This series is based on v5.12-rc3-mmotm-2021-03-17-22-24.  At a high
level, the series provides:
- Patches 1 & 2 change CMA bitmap mutex to an irq safe spinlock
- Patch 3 adds a mutex for proc/sysfs interfaces changing hugetlb counts
- Patches 4, 5 & 6 are aimed at reducing lock hold times.  To be clear
  the goal is to eliminate single lock hold times of a long duration.
  Overall lock hold time is not addressed.
- Patch 7 makes hugetlb_lock and subpool lock IRQ safe.  It also reverts
  the code which defers calls to a workqueue if !in_task.
- Patch 8 adds some lockdep_assert_held() calls

[1] https://lore.kernel.org/linux-mm/000000000000f1c03b05bc43aadc@google.com/
[2] http://lkml.kernel.org/r/20210311021321.127500-1-mike.kravetz@oracle.com

v3 -> v4
- Add changes needed for the series "Make alloc_contig_range handle
  Hugetlb pages"

v2 -> v3
- Update commit message in patch 1 as suggested by Michal
- Do not use spin_lock_irqsave/spin_unlock_irqrestore when we know we
  are in task context as suggested by Michal
- Remove unnecessary INIT_LIST_HEAD() as suggested by Muchun

v1 -> v2
- Drop Roman's cma_release_nowait() patches and just change CMA mutex
  to an IRQ safe spinlock.
- Cleanups to variable names, commets and commit messages as suggested
  by Michal, Oscar, Miaohe and Muchun.
- Dropped unnecessary INIT_LIST_HEAD as suggested by Michal and list_del
  as suggested by Muchun.
- Created update_and_free_pages_bulk helper as suggested by Michal.
- Rebased on v5.12-rc4-mmotm-2021-03-28-16-37
- Added Acked-by: and Reviewed-by: from v1

RFC -> v1
- Add Roman's cma_release_nowait() patches.  This eliminated the need
  to do a workqueue handoff in hugetlb code.
- Use Michal's suggestion to batch pages for freeing.  This eliminated
  the need to recalculate loop control variables when dropping the lock.
- Added lockdep_assert_held() calls
- Rebased to v5.12-rc3-mmotm-2021-03-17-22-24


Mike Kravetz (8):
  mm/cma: change cma mutex to irq safe spinlock
  hugetlb: no need to drop hugetlb_lock to call cma_release
  hugetlb: add per-hstate mutex to synchronize user adjustments
  hugetlb: create remove_hugetlb_page() to separate functionality
  hugetlb: call update_and_free_page without hugetlb_lock
  hugetlb: change free_pool_huge_page to remove_pool_huge_page
  hugetlb: make free_huge_page irq safe
  hugetlb: add lockdep_assert_held() calls for hugetlb_lock

 include/linux/hugetlb.h |   1 +
 mm/cma.c                |  18 +-
 mm/cma.h                |   2 +-
 mm/cma_debug.c          |   8 +-
 mm/hugetlb.c            | 384 +++++++++++++++++++++-------------------
 mm/hugetlb_cgroup.c     |   8 +-
 6 files changed, 218 insertions(+), 203 deletions(-)

Comments

Mike Kravetz April 8, 2021, 12:56 a.m. UTC | #1
Hello Andrew,

It has been suggested that this series be included before Oscar Salvador's
series "Make alloc_contig_range handle Hugetlb pages".  At a logical
level, here is what I think needs to happen.  However, I am not sure how
you do tree management and I am open to anything you suggest.  Please do
not start until we get an Ack from Oscar as he will need to participate.

Remove patches for this series in your tree from Mike Kravetz:
- hugetlb: add lockdep_assert_held() calls for hugetlb_lock
- hugetlb: fix irq locking omissions
- hugetlb: make free_huge_page irq safe
- hugetlb: change free_pool_huge_page to remove_pool_huge_page
- hugetlb: call update_and_free_page without hugetlb_lock
- hugetlb: create remove_hugetlb_page() to separate functionality
  /*
   * Technically, the following patches do not need to be removed as
   * they do not interact with Oscar's changes.  However, they do
   * contain 'cover letter comments' in the commit messages which may
   * not make sense out of context.
   */
- hugetlb: add per-hstate mutex to synchronize user adjustment
- hugetlb: no need to drop hugetlb_lock to call cma_release
- mm/cma: change cma mutex to irq safe spinlock

Remove patches for the series "Make alloc_contig_range handle Hugetlb pages"
from Oscar Salvador.
- mm,page_alloc: drop unnecessary checks from pfn_range_valid_contig
- mm: make alloc_contig_range handle in-use hugetlb pages
- mm: make alloc_contig_range handle free hugetlb pages
  /*
   * Technically, the following patches do not need to be removed as
   * they do not interact with Mike's changes.  Again, they do
   * contain 'cover letter comments' in the commit messages which may
   * not make sense out of context.
   */
- mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes-fix
- mm,compaction: let isolate_migratepages_{range,block} return error codes
- mm,page_alloc: bail out earlier on -ENOMEM in alloc_contig_migrate_range

After removing patches above, Mike will provide updated versions of:
/* If removed above */
- mm/cma: change cma mutex to irq safe spinlock
- hugetlb: no need to drop hugetlb_lock to call cma_release
- hugetlb: add per-hstate mutex to synchronize user adjustment
/* end of If removed above */
- hugetlb: create remove_hugetlb_page() to separate functionality
- hugetlb: call update_and_free_page without hugetlb_lock
- hugetlb: change free_pool_huge_page to remove_pool_huge_page
- hugetlb: make free_huge_page irq safe
- hugetlb: add lockdep_assert_held() calls for hugetlb_lock

With these patches in place, Oscar will provide updated versions of:
/* If removed above */
- mm,page_alloc: bail out earlier on -ENOMEM in alloc_contig_migrate_range
- mm,compaction: let isolate_migratepages_{range,block} return error codes
/* end of If removed above */
- mm: make alloc_contig_range handle free hugetlb pages
- mm: make alloc_contig_range handle in-use hugetlb pages
- mm,page_alloc: drop unnecessary checks from pfn_range_valid_contig

Sorry that things ended up in their current state as it will cause more
work for you.
Oscar Salvador April 8, 2021, 7:11 a.m. UTC | #2
On Wed, Apr 07, 2021 at 05:56:55PM -0700, Mike Kravetz wrote:
> Hello Andrew,
> 
> It has been suggested that this series be included before Oscar Salvador's
> series "Make alloc_contig_range handle Hugetlb pages".  At a logical
> level, here is what I think needs to happen.  However, I am not sure how
> you do tree management and I am open to anything you suggest.  Please do
> not start until we get an Ack from Oscar as he will need to participate.

As I said, this is fine by me.
I think it is the most straightforward way to proceed with this series
as this is a problem that has been bugging us fore quite some time now.

See below:

 
> Remove patches for the series "Make alloc_contig_range handle Hugetlb pages"
> from Oscar Salvador.
> - mm,page_alloc: drop unnecessary checks from pfn_range_valid_contig
> - mm: make alloc_contig_range handle in-use hugetlb pages
> - mm: make alloc_contig_range handle free hugetlb pages

Yes, those need to be removed

>   /*
>    * Technically, the following patches do not need to be removed as
>    * they do not interact with Mike's changes.  Again, they do
>    * contain 'cover letter comments' in the commit messages which may
>    * not make sense out of context.
>    */
> - mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes-fix
> - mm,compaction: let isolate_migratepages_{range,block} return error codes

Those could stay as well, but they mention a change in
alloc_contig_range() and without the context of the whole patchset might
be misleading, so I would pull those out as well.

> - mm,page_alloc: bail out earlier on -ENOMEM in alloc_contig_migrate_range

I think this one can stay.

But if It is going to be easier for Andrew, just pull them all out and I
will resend the whole series once this work goes in.

Thanks!
Andrew Morton April 9, 2021, 5:05 a.m. UTC | #3
On Thu, 8 Apr 2021 09:11:30 +0200 Oscar Salvador <osalvador@suse.de> wrote:

> But if It is going to be easier for Andrew, just pull them all out and I
> will resend the whole series once this work goes in.

I think so.

I shall drop these:

mmpage_alloc-bail-out-earlier-on-enomem-in-alloc_contig_migrate_range.patch
mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes.patch
mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes-fix.patch
mm-make-alloc_contig_range-handle-free-hugetlb-pages.patch
mm-make-alloc_contig_range-handle-in-use-hugetlb-pages.patch
mmpage_alloc-drop-unnecessary-checks-from-pfn_range_valid_contig.patch

and these:

mm-cma-change-cma-mutex-to-irq-safe-spinlock.patch
hugetlb-no-need-to-drop-hugetlb_lock-to-call-cma_release.patch
hugetlb-add-per-hstate-mutex-to-synchronize-user-adjustments.patch
hugetlb-create-remove_hugetlb_page-to-separate-functionality.patch
hugetlb-call-update_and_free_page-without-hugetlb_lock.patch
hugetlb-change-free_pool_huge_page-to-remove_pool_huge_page.patch
hugetlb-make-free_huge_page-irq-safe.patch
hugetlb-make-free_huge_page-irq-safe-fix.patch
hugetlb-add-lockdep_assert_held-calls-for-hugetlb_lock.patch

Along with notes-to-self that this:

  https://lkml.kernel.org/r/YGwnPCPaq1xKh/88@hirez.programming.kicks-ass.net

might need attention and that this:

  hugetlb-make-free_huge_page-irq-safe.patch

might need updating.
Mike Kravetz April 9, 2021, 8:43 p.m. UTC | #4
On 4/8/21 10:05 PM, Andrew Morton wrote:
> On Thu, 8 Apr 2021 09:11:30 +0200 Oscar Salvador <osalvador@suse.de> wrote:
> 
>> But if It is going to be easier for Andrew, just pull them all out and I
>> will resend the whole series once this work goes in.
> 
> I think so.
> 
> I shall drop these:
> 
> mmpage_alloc-bail-out-earlier-on-enomem-in-alloc_contig_migrate_range.patch
> mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes.patch
> mmcompaction-let-isolate_migratepages_rangeblock-return-error-codes-fix.patch
> mm-make-alloc_contig_range-handle-free-hugetlb-pages.patch
> mm-make-alloc_contig_range-handle-in-use-hugetlb-pages.patch
> mmpage_alloc-drop-unnecessary-checks-from-pfn_range_valid_contig.patch
> 
> and these:
> 
> mm-cma-change-cma-mutex-to-irq-safe-spinlock.patch
> hugetlb-no-need-to-drop-hugetlb_lock-to-call-cma_release.patch
> hugetlb-add-per-hstate-mutex-to-synchronize-user-adjustments.patch
> hugetlb-create-remove_hugetlb_page-to-separate-functionality.patch
> hugetlb-call-update_and_free_page-without-hugetlb_lock.patch
> hugetlb-change-free_pool_huge_page-to-remove_pool_huge_page.patch
> hugetlb-make-free_huge_page-irq-safe.patch
> hugetlb-make-free_huge_page-irq-safe-fix.patch
> hugetlb-add-lockdep_assert_held-calls-for-hugetlb_lock.patch
> 
> Along with notes-to-self that this:
> 
>   https://lkml.kernel.org/r/YGwnPCPaq1xKh/88@hirez.programming.kicks-ass.net
> 
> might need attention and that this:
> 
>   hugetlb-make-free_huge_page-irq-safe.patch
> 
> might need updating.
> 

Thank you Andrew!

I will send a v5 shortly based on dropping the above patch.