[v3,6/6] mm: document semantics of ZONE_MOVABLE

Message ID 20200804072408.5481-7-david@redhat.com (mailing list archive)
State New, archived
Series mm / virtio-mem: support ZONE_MOVABLE

Commit Message

David Hildenbrand Aug. 4, 2020, 7:24 a.m. UTC
Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs. migration /
allocations).

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

Comments

Mike Rapoport Aug. 4, 2020, 9:33 a.m. UTC | #1
On Tue, Aug 04, 2020 at 09:24:08AM +0200, David Hildenbrand wrote:
> Let's document what ZONE_MOVABLE means, how it's used, and which special
> cases we have regarding unmovable pages (memory offlining vs. migration /
> allocations).
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Mike Rapoport <rppt@kernel.org>
> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Baoquan He <bhe@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Several nits below, otherwise

Acked-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index f6f884970511d..600d449e7d9e9 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -372,6 +372,40 @@ enum zone_type {
>  	 */
>  	ZONE_HIGHMEM,
>  #endif
> +	/*
> +	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
> +	 * only contains movable pages. Main use cases are to make memory

"Primarily only" sounds awkward. Maybe

	... except that it only contains movable pages with few exceptional
	cases described below. 

And then 

	Main use cases for ZONE_MOVABLE are ...

> +	 * offlining more likely to succeed, and to locally limit unmovable
> +	 * allocations - e.g., to increase the number of THP/huge pages.
> +	 * Notable special cases are:
> +	 *
> +	 * 1. Pinned pages: (Long-term) pinning of movable pages might

		            ^long, capital L looked out of place for me

> +	 *    essentially turn such pages unmovable. Memory offlining might
> +	 *    retry a long time.
> +	 * 2. memblock allocations: kernelcore/movablecore setups might create
> +	 *    situations where ZONE_MOVABLE contains unmovable allocations
> +	 *    after boot. Memory offlining and allocations fail early.
> +	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
> +	 *    boot memory, not hotplugged memory. Memory offlining and
> +	 *    allocations fail early.

I would clarify where the struct pages for absent memory come from

> +	 * 4. PG_hwpoison pages: While poisoned pages can be skipped during
> +	 *    memory offlining, such pages cannot be allocated.
> +	 * 5. Unmovable PG_offline pages: In paravirtualized environments,
> +	 *    hotplugged memory blocks might only partially be managed by the
> +	 *    buddy (e.g., via XEN-balloon, Hyper-V balloon, virtio-mem). The
> +	 *    parts not managed by the buddy are unmovable PG_offline pages. In
> +	 *    some cases (virtio-mem), such pages can be skipped during
> +	 *    memory offlining, however, cannot be moved/allocated. These
> +	 *    techniques might use alloc_contig_range() to hide previously
> +	 *    exposed pages from the buddy again (e.g., to implement some sort
> +	 *    of memory unplug in virtio-mem).
> +	 *
> +	 * In general, no unmovable allocations that degrade memory offlining
> +	 * should end up in ZONE_MOVABLE. Allocators (like alloc_contig_range())
> +	 * have to expect that migrating pages in ZONE_MOVABLE can fail (even
> +	 * if has_unmovable_pages() states that there are no unmovable pages,
> +	 * there can be false negatives).
> +	 */
>  	ZONE_MOVABLE,
>  #ifdef CONFIG_ZONE_DEVICE
>  	ZONE_DEVICE,
> -- 
> 2.26.2
>
David Hildenbrand Aug. 4, 2020, 9:55 a.m. UTC | #2
On 04.08.20 11:33, Mike Rapoport wrote:
> On Tue, Aug 04, 2020 at 09:24:08AM +0200, David Hildenbrand wrote:
>> Let's document what ZONE_MOVABLE means, how it's used, and which special
>> cases we have regarding unmovable pages (memory offlining vs. migration /
>> allocations).
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Mike Rapoport <rppt@kernel.org>
>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Baoquan He <bhe@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Several nits below, otherwise
> 
> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> 
>> ---
>>  include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
>>  1 file changed, 34 insertions(+)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index f6f884970511d..600d449e7d9e9 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -372,6 +372,40 @@ enum zone_type {
>>  	 */
>>  	ZONE_HIGHMEM,
>>  #endif
>> +	/*
>> +	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
>> +	 * only contains movable pages. Main use cases are to make memory
> 
> "Primarily only" sounds awkward. Maybe
> 
> 	... except that it only contains movable pages with few exceptional
> 	cases described below. 
> 
> And then 
> 
> 	Main use cases for ZONE_MOVABLE are ...

Ack!

> 
>> +	 * offlining more likely to succeed, and to locally limit unmovable
>> +	 * allocations - e.g., to increase the number of THP/huge pages.
>> +	 * Notable special cases are:
>> +	 *
>> +	 * 1. Pinned pages: (Long-term) pinning of movable pages might
> 
> 		            ^long, capital L looked out of place for me

Ack!

> 
>> +	 *    essentially turn such pages unmovable. Memory offlining might
>> +	 *    retry a long time.
>> +	 * 2. memblock allocations: kernelcore/movablecore setups might create
>> +	 *    situations where ZONE_MOVABLE contains unmovable allocations
>> +	 *    after boot. Memory offlining and allocations fail early.
>> +	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
>> +	 *    boot memory, not hotplugged memory. Memory offlining and
>> +	 *    allocations fail early.
> 
> I would clarify where the struct pages for absent memory come from

Something like:

Memory holes: We might have a memmap for memory holes, for example, if
we have sections that are only partially System RAM. Such pages cannot
be ...

?

Thanks!
Mike Rapoport Aug. 4, 2020, 10:03 a.m. UTC | #3
On Tue, Aug 04, 2020 at 11:55:10AM +0200, David Hildenbrand wrote:
> On 04.08.20 11:33, Mike Rapoport wrote:
> > On Tue, Aug 04, 2020 at 09:24:08AM +0200, David Hildenbrand wrote:
> >> Let's document what ZONE_MOVABLE means, how it's used, and which special
> >> cases we have regarding unmovable pages (memory offlining vs. migration /
> >> allocations).
> >>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Michael S. Tsirkin <mst@redhat.com>
> >> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> >> Cc: Mike Rapoport <rppt@kernel.org>
> >> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> >> Cc: Baoquan He <bhe@redhat.com>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> > 
> > Several nits below, otherwise
> > 
> > Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> > 
> >> ---
> >>  include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
> >>  1 file changed, 34 insertions(+)
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index f6f884970511d..600d449e7d9e9 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -372,6 +372,40 @@ enum zone_type {
> >>  	 */
> >>  	ZONE_HIGHMEM,
> >>  #endif
> >> +	/*
> >> +	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
> >> +	 * only contains movable pages. Main use cases are to make memory
> > 
> > "Primarily only" sounds awkward. Maybe
> > 
> > 	... except that it only contains movable pages with few exceptional
> > 	cases described below. 
> > 
> > And then 
> > 
> > 	Main use cases for ZONE_MOVABLE are ...
> 
> Ack!
> 
> > 
> >> +	 * offlining more likely to succeed, and to locally limit unmovable
> >> +	 * allocations - e.g., to increase the number of THP/huge pages.
> >> +	 * Notable special cases are:
> >> +	 *
> >> +	 * 1. Pinned pages: (Long-term) pinning of movable pages might
> > 
> > 		            ^long, capital L looked out of place for me
> 
> Ack!
> 
> > 
> >> +	 *    essentially turn such pages unmovable. Memory offlining might
> >> +	 *    retry a long time.
> >> +	 * 2. memblock allocations: kernelcore/movablecore setups might create
> >> +	 *    situations where ZONE_MOVABLE contains unmovable allocations
> >> +	 *    after boot. Memory offlining and allocations fail early.
> >> +	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
> >> +	 *    boot memory, not hotplugged memory. Memory offlining and
> >> +	 *    allocations fail early.
> > 
> > I would clarify where the struct pages for absent memory come from
> 
> Something like:
> 
> Memory holes: We might have a memmap for memory holes, for example, if

               ^w ;-)

> we have sections that are only partially System RAM. Such pages cannot
> be ...

How about

... sections that are only partially populated 

?
 
> ?
> 
> Thanks!
> 
> -- 
> Thanks,
> 
> David / dhildenb
>
David Hildenbrand Aug. 4, 2020, 10:04 a.m. UTC | #4
On 04.08.20 12:03, Mike Rapoport wrote:
> On Tue, Aug 04, 2020 at 11:55:10AM +0200, David Hildenbrand wrote:
>> On 04.08.20 11:33, Mike Rapoport wrote:
>>> On Tue, Aug 04, 2020 at 09:24:08AM +0200, David Hildenbrand wrote:
>>>> Let's document what ZONE_MOVABLE means, how it's used, and which special
>>>> cases we have regarding unmovable pages (memory offlining vs. migration /
>>>> allocations).
>>>>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>>>> Cc: Mike Rapoport <rppt@kernel.org>
>>>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>>>> Cc: Baoquan He <bhe@redhat.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>
>>> Several nits below, otherwise
>>>
>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>>> ---
>>>>  include/linux/mmzone.h | 34 ++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 34 insertions(+)
>>>>
>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>> index f6f884970511d..600d449e7d9e9 100644
>>>> --- a/include/linux/mmzone.h
>>>> +++ b/include/linux/mmzone.h
>>>> @@ -372,6 +372,40 @@ enum zone_type {
>>>>  	 */
>>>>  	ZONE_HIGHMEM,
>>>>  #endif
>>>> +	/*
>>>> +	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
>>>> +	 * only contains movable pages. Main use cases are to make memory
>>>
>>> "Primarily only" sounds awkward. Maybe
>>>
>>> 	... except that it only contains movable pages with few exceptional
>>> 	cases described below. 
>>>
>>> And then 
>>>
>>> 	Main use cases for ZONE_MOVABLE are ...
>>
>> Ack!
>>
>>>
>>>> +	 * offlining more likely to succeed, and to locally limit unmovable
>>>> +	 * allocations - e.g., to increase the number of THP/huge pages.
>>>> +	 * Notable special cases are:
>>>> +	 *
>>>> +	 * 1. Pinned pages: (Long-term) pinning of movable pages might
>>>
>>> 		            ^long, capital L looked out of place for me
>>
>> Ack!
>>
>>>
>>>> +	 *    essentially turn such pages unmovable. Memory offlining might
>>>> +	 *    retry a long time.
>>>> +	 * 2. memblock allocations: kernelcore/movablecore setups might create
>>>> +	 *    situations where ZONE_MOVABLE contains unmovable allocations
>>>> +	 *    after boot. Memory offlining and allocations fail early.
>>>> +	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
>>>> +	 *    boot memory, not hotplugged memory. Memory offlining and
>>>> +	 *    allocations fail early.
>>>
>>> I would clarify where the struct pages for absent memory come from
>>
>> Something like:
>>
>> Memory holes: We might have a memmap for memory holes, for example, if
> 
>                ^w ;-)
> 
>> we have sections that are only partially System RAM. Such pages cannot
>> be ...
> 
> How about
> 
> ... sections that are only partially populated 
> 
> ?

Yeah, shorter. Thanks!
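
The "partially populated sections" wording agreed on above can be illustrated
with a little arithmetic. The following is only a userspace sketch, not kernel
code: it assumes the memmap for boot memory covers whole sections, and it
assumes 4 KiB pages with 128 MiB sections (32768 pages per section). The
constant and the helper name are made up for illustration.

```c
#include <assert.h>

/*
 * Assumed for illustration only: 4 KiB pages and 128 MiB sections,
 * i.e. 32768 pages per section. The real value depends on the
 * architecture and configuration.
 */
#define SKETCH_PAGES_PER_SECTION 32768UL

/*
 * Hypothetical helper (not a kernel function): given populated boot
 * memory [start_pfn, end_pfn), return how many PFNs in the spanned
 * sections would have a memmap entry but no memory behind them.
 */
static unsigned long sketch_hole_pfns(unsigned long start_pfn,
				      unsigned long end_pfn)
{
	unsigned long first, last;

	assert(start_pfn < end_pfn);

	/* Round the populated range out to whole sections. */
	first = start_pfn / SKETCH_PAGES_PER_SECTION * SKETCH_PAGES_PER_SECTION;
	last = (end_pfn + SKETCH_PAGES_PER_SECTION - 1) /
	       SKETCH_PAGES_PER_SECTION * SKETCH_PAGES_PER_SECTION;

	/* Everything in those sections that is not populated is a hole. */
	return (last - first) - (end_pfn - start_pfn);
}
```

Under these assumptions, a fully populated section has no such hole pages,
while a section that is only half populated leaves 16384 page structs that
describe memory which can never be allocated.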

Patch

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f6f884970511d..600d449e7d9e9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -372,6 +372,40 @@  enum zone_type {
 	 */
 	ZONE_HIGHMEM,
 #endif
+	/*
+	 * ZONE_MOVABLE is similar to ZONE_NORMAL, except that it *primarily*
+	 * only contains movable pages. Main use cases are to make memory
+	 * offlining more likely to succeed, and to locally limit unmovable
+	 * allocations - e.g., to increase the number of THP/huge pages.
+	 * Notable special cases are:
+	 *
+	 * 1. Pinned pages: (Long-term) pinning of movable pages might
+	 *    essentially turn such pages unmovable. Memory offlining might
+	 *    retry a long time.
+	 * 2. memblock allocations: kernelcore/movablecore setups might create
+	 *    situations where ZONE_MOVABLE contains unmovable allocations
+	 *    after boot. Memory offlining and allocations fail early.
+	 * 3. Memory holes: Such pages cannot be allocated. Applies only to
+	 *    boot memory, not hotplugged memory. Memory offlining and
+	 *    allocations fail early.
+	 * 4. PG_hwpoison pages: While poisoned pages can be skipped during
+	 *    memory offlining, such pages cannot be allocated.
+	 * 5. Unmovable PG_offline pages: In paravirtualized environments,
+	 *    hotplugged memory blocks might only partially be managed by the
+	 *    buddy (e.g., via XEN-balloon, Hyper-V balloon, virtio-mem). The
+	 *    parts not managed by the buddy are unmovable PG_offline pages. In
+	 *    some cases (virtio-mem), such pages can be skipped during
+	 *    memory offlining, however, cannot be moved/allocated. These
+	 *    techniques might use alloc_contig_range() to hide previously
+	 *    exposed pages from the buddy again (e.g., to implement some sort
+	 *    of memory unplug in virtio-mem).
+	 *
+	 * In general, no unmovable allocations that degrade memory offlining
+	 * should end up in ZONE_MOVABLE. Allocators (like alloc_contig_range())
+	 * have to expect that migrating pages in ZONE_MOVABLE can fail (even
+	 * if has_unmovable_pages() states that there are no unmovable pages,
+	 * there can be false negatives).
+	 */
 	ZONE_MOVABLE,
 #ifdef CONFIG_ZONE_DEVICE
 	ZONE_DEVICE,
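
The five special cases in the comment above can be condensed into a small
lookup table. This is a userspace sketch for reading convenience only; none of
the identifiers exist in the kernel, and the table encodes nothing beyond what
the comment states. The comment does not say what happens to allocations that
hit pinned pages, so that entry is marked as unspecified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the five cases listed in the comment. */
enum sketch_case {
	SKETCH_PINNED,
	SKETCH_MEMBLOCK,
	SKETCH_HOLE,
	SKETCH_HWPOISON,
	SKETCH_PG_OFFLINE,
	SKETCH_NR_CASES,
};

/* Outcomes as worded in the comment. */
enum sketch_outcome {
	SKETCH_UNSPECIFIED,		/* not stated in the comment */
	SKETCH_RETRIES_LONG,		/* "might retry a long time" */
	SKETCH_FAILS_EARLY,		/* "fail early" */
	SKETCH_SKIPPED,			/* "can be skipped" */
	SKETCH_SKIPPED_SOME_CASES,	/* skipped "in some cases (virtio-mem)" */
	SKETCH_NOT_POSSIBLE,		/* "cannot be allocated" */
};

struct sketch_info {
	const char *name;
	enum sketch_outcome offlining;
	enum sketch_outcome allocation;
};

static const struct sketch_info sketch_table[SKETCH_NR_CASES] = {
	[SKETCH_PINNED]     = { "pinned pages",
				SKETCH_RETRIES_LONG, SKETCH_UNSPECIFIED },
	[SKETCH_MEMBLOCK]   = { "memblock allocations",
				SKETCH_FAILS_EARLY, SKETCH_FAILS_EARLY },
	[SKETCH_HOLE]       = { "memory holes",
				SKETCH_FAILS_EARLY, SKETCH_FAILS_EARLY },
	[SKETCH_HWPOISON]   = { "PG_hwpoison pages",
				SKETCH_SKIPPED, SKETCH_NOT_POSSIBLE },
	[SKETCH_PG_OFFLINE] = { "unmovable PG_offline pages",
				SKETCH_SKIPPED_SOME_CASES, SKETCH_NOT_POSSIBLE },
};

/* Look up one case; the index must be a valid enum sketch_case. */
static const struct sketch_info *sketch_lookup(enum sketch_case c)
{
	assert(c >= 0 && c < SKETCH_NR_CASES);
	return &sketch_table[c];
}

/* True if the comment says offlining can skip pages of this kind. */
static bool sketch_offlining_can_skip(enum sketch_case c)
{
	enum sketch_outcome o = sketch_lookup(c)->offlining;

	return o == SKETCH_SKIPPED || o == SKETCH_SKIPPED_SOME_CASES;
}
```

Read this way, only PG_hwpoison pages and (for virtio-mem) PG_offline pages
can be stepped over by memory offlining, and neither can be allocated.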