[0/9] disable pcplists during memory offline

Message ID 20200922143712.12048-1-vbabka@suse.cz (mailing list archive)

Message

Vlastimil Babka Sept. 22, 2020, 2:37 p.m. UTC
As per the discussions [1] [2] this is an attempt to implement David's
suggestion that page isolation should disable pcplists to avoid races with page
freeing in progress. This is done without extra checks in fast paths, as
explained in Patch 9. The repeated draining done by [2] is then no longer
needed. Previous version (RFC) is at [3].

The RFC tried to hide the disabling/enabling of pcplists inside page isolation,
but that wasn't completely possible, as memory offline does not unisolate.
Michal suggested an explicit API in [4], so that's the current implementation,
and it indeed seems nicer.

Once we accept that page isolation users need to do explicit actions around it
depending on the needed guarantees, we can also IMHO accept that the current
pcplist draining can also be done by the callers, which is more effective.
After all, there are only two users of page isolation. So patch 7 does
effectively the same thing as Pavel proposed in [5], and patches 8-9 implement
stronger guarantees only for memory offline. If CMA decides to opt in to the
stronger guarantee, it's easy to do so.

Patches 1-6 are preparatory cleanups for pcplist disabling.

The patchset was briefly tested in QEMU to check that memory online/offline
works, but I haven't done a stress test that would prove the race fixed by [2]
is eliminated.

Note that patch 9 could be avoided if we instead adjusted page freeing as shown
in [6], but I believe the current implementation of disabling pcplists is not
too complex, so I would prefer this to adding new checks and a longer
irq-disabled section to the page freeing hotpaths.
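
To illustrate why disabling pcplists need not add checks to the free hotpath,
here is a toy user-space model, not kernel code. Every name in it (toy_zone,
toy_pcp, toy_free_page(), toy_drain(), toy_pcp_disable(), toy_pcp_enable()) is
hypothetical. It also assumes one particular mechanism that this cover letter
does not spell out: forcing the list's high threshold to 0 and its batch to 1,
so that the existing "count >= high" test flushes every freed page right away.
It models a single CPU and ignores locking entirely.

```c
#include <assert.h>

/* Toy user-space model. All names are hypothetical, not kernel APIs. */
struct toy_pcp {
	int count;	/* pages cached on this CPU's list */
	int high;	/* flush once count reaches this threshold */
	int batch;	/* max pages handed back to the buddy lists per flush */
};

struct toy_zone {
	int buddy_free;		/* pages visible to the buddy allocator */
	struct toy_pcp pcp;	/* a single CPU, for simplicity */
	int saved_high;
	int saved_batch;
};

/* Hot path: identical whether pcplists are enabled or disabled. */
void toy_free_page(struct toy_zone *z)
{
	struct toy_pcp *p = &z->pcp;

	p->count++;
	if (p->count >= p->high) {
		int n = p->count < p->batch ? p->count : p->batch;

		p->count -= n;
		z->buddy_free += n;
	}
}

/* Slow path: flush everything cached on the list. */
void toy_drain(struct toy_zone *z)
{
	z->buddy_free += z->pcp.count;
	z->pcp.count = 0;
}

/* Assumed mechanism: high = 0, batch = 1 makes every free flush at once. */
void toy_pcp_disable(struct toy_zone *z)
{
	assert(z->pcp.high > 0);	/* not already disabled */
	z->saved_high = z->pcp.high;
	z->saved_batch = z->pcp.batch;
	z->pcp.high = 0;
	z->pcp.batch = 1;
	toy_drain(z);
}

/* Restore the thresholds saved by toy_pcp_disable(). */
void toy_pcp_enable(struct toy_zone *z)
{
	z->pcp.high = z->saved_high;
	z->pcp.batch = z->saved_batch;
}
```

In this model, a page freed after a plain toy_drain() lands back on the list
and stays hidden from the buddy allocator, which is a simplified picture of the
race. A page freed after toy_pcp_disable() goes straight to buddy_free, and
toy_free_page() itself is unchanged in both cases.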

[1] https://lore.kernel.org/linux-mm/20200901124615.137200-1-pasha.tatashin@soleen.com/
[2] https://lore.kernel.org/linux-mm/20200903140032.380431-1-pasha.tatashin@soleen.com/
[3] https://lore.kernel.org/linux-mm/20200907163628.26495-1-vbabka@suse.cz/
[4] https://lore.kernel.org/linux-mm/20200909113647.GG7348@dhcp22.suse.cz/
[5] https://lore.kernel.org/linux-mm/20200904151448.100489-3-pasha.tatashin@soleen.com/
[6] https://lore.kernel.org/linux-mm/3d3b53db-aeaa-ff24-260b-36427fac9b1c@suse.cz/

Vlastimil Babka (9):
  mm, page_alloc: clean up pageset high and batch update
  mm, page_alloc: calculate pageset high and batch once per zone
  mm, page_alloc: remove setup_pageset()
  mm, page_alloc: simplify pageset_update()
  mm, page_alloc: make per_cpu_pageset accessible only after init
  mm, page_alloc: cache pageset high and batch in struct zone
  mm, page_alloc: move draining pcplists to page isolation users
  mm, page_alloc: drain all pcplists during memory offline
  mm, page_alloc: optionally disable pcplists during page isolation

 include/linux/gfp.h            |   1 +
 include/linux/mmzone.h         |   8 ++
 include/linux/page-isolation.h |   2 +
 mm/internal.h                  |   4 +
 mm/memory_hotplug.c            |  27 +++--
 mm/page_alloc.c                | 190 ++++++++++++++++++---------------
 mm/page_isolation.c            |  26 ++++-
 7 files changed, 152 insertions(+), 106 deletions(-)

Comments

David Hildenbrand Sept. 22, 2020, 5:15 p.m. UTC | #1
On 22.09.20 16:37, Vlastimil Babka wrote:
> As per the discussions [1] [2] this is an attempt to implement David's
> suggestion that page isolation should disable pcplists to avoid races with page
> freeing in progress. This is done without extra checks in fast paths, as
> explained in Patch 9. The repeated draining done by [2] is then no longer
> needed. Previous version (RFC) is at [3].
> 
> The RFC tried to hide the disabling/enabling of pcplists inside page isolation,
> but that wasn't completely possible, as memory offline does not unisolate.
> Michal suggested an explicit API in [4], so that's the current implementation,
> and it indeed seems nicer.
> 
> Once we accept that page isolation users need to do explicit actions around it
> depending on the needed guarantees, we can also IMHO accept that the current
> pcplist draining can be also done by the callers, which is more effective.
> After all, there are only two users of page isolation. So patch 7 does
> effectively the same thing as Pavel proposed in [5], and patches 8-9 implement
> stronger guarantees only for memory offline. If CMA decides to opt-in to the
> stronger guarantee, it's easy to do so.
> 
> Patches 1-6 are preparatory cleanups for pcplist disabling.
> 
> Patchset was briefly tested in QEMU so that memory online/offline works, but
> I haven't done a stress test that would prove the race fixed by [2] is
> eliminated.
> 
> Note that patch 9 could be avoided if we instead adjusted page freeing as shown
> in [6], but I believe the current implementation of disabling pcplists is not
> too complex, so I would prefer this to adding new checks and a longer
> irq-disabled section to the page freeing hotpaths.

Haven't looked into the details (yet), but I assume we can add some flag
to alloc_contig_range() to also disable+flush+enable (or let the caller
do it, for example for a bunch of bulk allocations; TBD).

Result of patch #9 looks quite clean.