[v3,07/15] mm/memory_hotplug: Introduce offline_and_remove_memory()

Message ID 20200507103119.11219-8-david@redhat.com (mailing list archive)
State New, archived
Series virtio-mem: paravirtualized memory

Commit Message

David Hildenbrand May 7, 2020, 10:31 a.m. UTC
virtio-mem wants to offline and remove a memory block once it has unplugged
all subblocks (e.g., using alloc_contig_range()). Let's provide
an interface to do that from a driver. virtio-mem already supports
offlining partially unplugged memory blocks. Offlining a fully unplugged
memory block will not require migrating any pages. All unplugged
subblocks are PageOffline() and have a reference count of 0, so the
offlining code will simply skip them.

All we need is an interface to offline and remove the memory from kernel
module context, where we don't have access to the memory block devices
(esp. find_memory_block() and device_offline()) and the device hotplug
lock.

To keep things simple, only allow working on a single memory block.

Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h |  1 +
 mm/memory_hotplug.c            | 37 ++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

Comments

Michael S. Tsirkin May 7, 2020, 10:46 a.m. UTC | #1
On Thu, May 07, 2020 at 12:31:11PM +0200, David Hildenbrand wrote:
> virtio-mem wants to offline and remove a memory block once it unplugged
> all subblocks (e.g., using alloc_contig_range()). Let's provide
> an interface to do that from a driver. virtio-mem already supports to
> offline partially unplugged memory blocks. Offlining a fully unplugged
> memory block will not require to migrate any pages. All unplugged
> subblocks are PageOffline() and have a reference count of 0 - so
> offlining code will simply skip them.
> 
> All we need is an interface to offline and remove the memory from kernel
> module context, where we don't have access to the memory block devices
> (esp. find_memory_block() and device_offline()) and the device hotplug
> lock.
> 
> To keep things simple, allow to only work on a single memory block.
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Oscar Salvador <osalvador@suse.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Qian Cai <cai@lca.pw>
> Signed-off-by: David Hildenbrand <david@redhat.com>


didn't you lose Andrew Morton's ack here?

> ---
>  include/linux/memory_hotplug.h |  1 +
>  mm/memory_hotplug.c            | 37 ++++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)

I get:

error: sha1 information is lacking or useless (mm/memory_hotplug.c).
error: could not build fake ancestor

which version is this against? Pls post patches on top of some tag
in Linus' tree if possible.


> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 7dca9cd6076b..d641828e5596 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -318,6 +318,7 @@ extern void try_offline_node(int nid);
>  extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
>  extern int remove_memory(int nid, u64 start, u64 size);
>  extern void __remove_memory(int nid, u64 start, u64 size);
> +extern int offline_and_remove_memory(int nid, u64 start, u64 size);
>  
>  #else
>  static inline void try_offline_node(int nid) {}
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 936bfe208a6e..bf1941f02a60 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1748,4 +1748,41 @@ int remove_memory(int nid, u64 start, u64 size)
>  	return rc;
>  }
>  EXPORT_SYMBOL_GPL(remove_memory);
> +
> +/*
> + * Try to offline and remove a memory block. Might take a long time to
> + * finish in case memory is still in use. Primarily useful for memory devices
> + * that logically unplugged all memory (so it's no longer in use) and want to
> + * offline + remove the memory block.
> + */
> +int offline_and_remove_memory(int nid, u64 start, u64 size)
> +{
> +	struct memory_block *mem;
> +	int rc = -EINVAL;
> +
> +	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
> +	    size != memory_block_size_bytes())
> +		return rc;
> +
> +	lock_device_hotplug();
> +	mem = find_memory_block(__pfn_to_section(PFN_DOWN(start)));
> +	if (mem)
> +		rc = device_offline(&mem->dev);
> +	/* Ignore if the device is already offline. */
> +	if (rc > 0)
> +		rc = 0;
> +
> +	/*
> +	 * In case we succeeded to offline the memory block, remove it.
> +	 * This cannot fail as it cannot get onlined in the meantime.
> +	 */
> +	if (!rc) {
> +		rc = try_remove_memory(nid, start, size);
> +		WARN_ON_ONCE(rc);
> +	}
> +	unlock_device_hotplug();
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(offline_and_remove_memory);
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
> -- 
> 2.25.3
David Hildenbrand May 7, 2020, 11:24 a.m. UTC | #2
On 07.05.20 12:46, Michael S. Tsirkin wrote:
> On Thu, May 07, 2020 at 12:31:11PM +0200, David Hildenbrand wrote:
>> virtio-mem wants to offline and remove a memory block once it unplugged
>> all subblocks (e.g., using alloc_contig_range()). Let's provide
>> an interface to do that from a driver. virtio-mem already supports to
>> offline partially unplugged memory blocks. Offlining a fully unplugged
>> memory block will not require to migrate any pages. All unplugged
>> subblocks are PageOffline() and have a reference count of 0 - so
>> offlining code will simply skip them.
>>
>> All we need is an interface to offline and remove the memory from kernel
>> module context, where we don't have access to the memory block devices
>> (esp. find_memory_block() and device_offline()) and the device hotplug
>> lock.
>>
>> To keep things simple, allow to only work on a single memory block.
>>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Oscar Salvador <osalvador@suse.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>> Cc: Wei Yang <richard.weiyang@gmail.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Qian Cai <cai@lca.pw>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> 
> didn't you lose Andrew Morton's ack here?

Yeah, thanks for noticing.

> 
>> ---
>>  include/linux/memory_hotplug.h |  1 +
>>  mm/memory_hotplug.c            | 37 ++++++++++++++++++++++++++++++++++
>>  2 files changed, 38 insertions(+)
> 
> I get:
> 
> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
> error: could not build fake ancestor
> 
> which version is this against? Pls post patches on top of some tag
> in Linus' tree if possible.

As the cover states, latest linux-next. To be precise

commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
next/master)
Author: Stephen Rothwell <sfr@canb.auug.org.au>
Date:   Thu May 7 18:11:31 2020 +1000

    Add linux-next specific files for 20200507
David Hildenbrand May 7, 2020, 11:33 a.m. UTC | #3
>> I get:
>>
>> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
>> error: could not build fake ancestor
>>
>> which version is this against? Pls post patches on top of some tag
>> in Linus' tree if possible.
> 
> As the cover states, latest linux-next. To be precise
> 
> commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
> next/master)
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date:   Thu May 7 18:11:31 2020 +1000
> 
>     Add linux-next specific files for 20200507
> 

The patches seem to apply cleanly on top of

commit a811c1fa0a02c062555b54651065899437bacdbe (linus/master)
Merge: b9388959ba50 16f8036086a9
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed May 6 20:53:22 2020 -0700

    Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net


I can resend based on that, after giving it a short test.
Michael S. Tsirkin May 7, 2020, 11:33 a.m. UTC | #4
On Thu, May 07, 2020 at 01:24:38PM +0200, David Hildenbrand wrote:
> On 07.05.20 12:46, Michael S. Tsirkin wrote:
> > On Thu, May 07, 2020 at 12:31:11PM +0200, David Hildenbrand wrote:
> >> virtio-mem wants to offline and remove a memory block once it unplugged
> >> all subblocks (e.g., using alloc_contig_range()). Let's provide
> >> an interface to do that from a driver. virtio-mem already supports to
> >> offline partially unplugged memory blocks. Offlining a fully unplugged
> >> memory block will not require to migrate any pages. All unplugged
> >> subblocks are PageOffline() and have a reference count of 0 - so
> >> offlining code will simply skip them.
> >>
> >> All we need is an interface to offline and remove the memory from kernel
> >> module context, where we don't have access to the memory block devices
> >> (esp. find_memory_block() and device_offline()) and the device hotplug
> >> lock.
> >>
> >> To keep things simple, allow to only work on a single memory block.
> >>
> >> Acked-by: Michal Hocko <mhocko@suse.com>
> >> Tested-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Oscar Salvador <osalvador@suse.com>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> >> Cc: Wei Yang <richard.weiyang@gmail.com>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: Qian Cai <cai@lca.pw>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> > 
> > 
> > didn't you lose Andrew Morton's ack here?
> 
> Yeah, thanks for noticing.
> 
> > 
> >> ---
> >>  include/linux/memory_hotplug.h |  1 +
> >>  mm/memory_hotplug.c            | 37 ++++++++++++++++++++++++++++++++++
> >>  2 files changed, 38 insertions(+)
> > 
> > I get:
> > 
> > error: sha1 information is lacking or useless (mm/memory_hotplug.c).
> > error: could not build fake ancestor
> > 
> > which version is this against? Pls post patches on top of some tag
> > in Linus' tree if possible.
> 
> As the cover states, latest linux-next. To be precise
> 
> commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
> next/master)
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date:   Thu May 7 18:11:31 2020 +1000
> 
>     Add linux-next specific files for 20200507
> 

Don't base on linux-next please. Generally base on the tree you are
targeting, or Linus' tree.


> -- 
> Thanks,
> 
> David / dhildenb
Michael S. Tsirkin May 7, 2020, 11:34 a.m. UTC | #5
On Thu, May 07, 2020 at 01:33:23PM +0200, David Hildenbrand wrote:
> >> I get:
> >>
> >> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
> >> error: could not build fake ancestor
> >>
> >> which version is this against? Pls post patches on top of some tag
> >> in Linus' tree if possible.
> > 
> > As the cover states, latest linux-next. To be precise
> > 
> > commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
> > next/master)
> > Author: Stephen Rothwell <sfr@canb.auug.org.au>
> > Date:   Thu May 7 18:11:31 2020 +1000
> > 
> >     Add linux-next specific files for 20200507
> > 
> 
> The patches seem to apply cleanly on top of
> 
> commit a811c1fa0a02c062555b54651065899437bacdbe (linus/master)
> Merge: b9388959ba50 16f8036086a9
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Wed May 6 20:53:22 2020 -0700
> 
>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Because you have the relevant hashes in your git tree not pruned yet.
Do a new clone and they won't apply.

> 
> I can resend based on that, after giving it a short test.
> 
> -- 
> Thanks,
> 
> David / dhildenb
David Hildenbrand May 7, 2020, 11:37 a.m. UTC | #6
On 07.05.20 13:34, Michael S. Tsirkin wrote:
> On Thu, May 07, 2020 at 01:33:23PM +0200, David Hildenbrand wrote:
>>>> I get:
>>>>
>>>> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
>>>> error: could not build fake ancestor
>>>>
>>>> which version is this against? Pls post patches on top of some tag
>>>> in Linus' tree if possible.
>>>
>>> As the cover states, latest linux-next. To be precise
>>>
>>> commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
>>> next/master)
>>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>>> Date:   Thu May 7 18:11:31 2020 +1000
>>>
>>>     Add linux-next specific files for 20200507
>>>
>>
>> The patches seem to apply cleanly on top of
>>
>> commit a811c1fa0a02c062555b54651065899437bacdbe (linus/master)
>> Merge: b9388959ba50 16f8036086a9
>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>> Date:   Wed May 6 20:53:22 2020 -0700
>>
>>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
> 
> Because you have the relevant hashes in your git tree not pruned yet.
> Do a new clone and they won't apply.
> 

Yeah, most probably, it knows how to merge. I'm used to sending all my
-mm stuff based on -next, so this here is different.

I'll wait a bit and then send v4 based on latest linus/master, adding
the two acks and reshuffling the MAINTAINERS patch. Thanks.
Michael S. Tsirkin May 7, 2020, 12:11 p.m. UTC | #7
On Thu, May 07, 2020 at 01:37:30PM +0200, David Hildenbrand wrote:
> On 07.05.20 13:34, Michael S. Tsirkin wrote:
> > On Thu, May 07, 2020 at 01:33:23PM +0200, David Hildenbrand wrote:
> >>>> I get:
> >>>>
> >>>> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
> >>>> error: could not build fake ancestor
> >>>>
> >>>> which version is this against? Pls post patches on top of some tag
> >>>> in Linus' tree if possible.
> >>>
> >>> As the cover states, latest linux-next. To be precise
> >>>
> >>> commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
> >>> next/master)
> >>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> >>> Date:   Thu May 7 18:11:31 2020 +1000
> >>>
> >>>     Add linux-next specific files for 20200507
> >>>
> >>
> >> The patches seem to apply cleanly on top of
> >>
> >> commit a811c1fa0a02c062555b54651065899437bacdbe (linus/master)
> >> Merge: b9388959ba50 16f8036086a9
> >> Author: Linus Torvalds <torvalds@linux-foundation.org>
> >> Date:   Wed May 6 20:53:22 2020 -0700
> >>
> >>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
> > 
> > Because you have the relevant hashes in your git tree not pruned yet.
> > Do a new clone and they won't apply.
> > 
> 
> Yeah, most probably, it knows how to merge. I'm used to sending all my
> -mm stuff based on -next, so this here is different.


Documentation/process/5.Posting.rst addresses this:


Patches must be prepared against a specific version of the kernel.  As a
general rule, a patch should be based on the current mainline as found in
Linus's git tree.  When basing on mainline, start with a well-known release
point - a stable or -rc release - rather than branching off the mainline at
an arbitrary spot.

It may become necessary to make versions against -mm, linux-next, or a
subsystem tree, though, to facilitate wider testing and review.  Depending
on the area of your patch and what is going on elsewhere, basing a patch
against these other trees can require a significant amount of work
resolving conflicts and dealing with API changes.





> I'll wait a bit and then send v4 based on latest linus/master, adding
> the two acks and reshuffling the MAINTAINERS patch. Thanks.
> 
> -- 
> Thanks,
> 
> David / dhildenb
David Hildenbrand May 7, 2020, 12:24 p.m. UTC | #8
On 07.05.20 14:11, Michael S. Tsirkin wrote:
> On Thu, May 07, 2020 at 01:37:30PM +0200, David Hildenbrand wrote:
>> On 07.05.20 13:34, Michael S. Tsirkin wrote:
>>> On Thu, May 07, 2020 at 01:33:23PM +0200, David Hildenbrand wrote:
>>>>>> I get:
>>>>>>
>>>>>> error: sha1 information is lacking or useless (mm/memory_hotplug.c).
>>>>>> error: could not build fake ancestor
>>>>>>
>>>>>> which version is this against? Pls post patches on top of some tag
>>>>>> in Linus' tree if possible.
>>>>>
>>>>> As the cover states, latest linux-next. To be precise
>>>>>
>>>>> commit 6b43f715b6379433e8eb30aa9bcc99bd6a585f77 (tag: next-20200507,
>>>>> next/master)
>>>>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>>>>> Date:   Thu May 7 18:11:31 2020 +1000
>>>>>
>>>>>     Add linux-next specific files for 20200507
>>>>>
>>>>
>>>> The patches seem to apply cleanly on top of
>>>>
>>>> commit a811c1fa0a02c062555b54651065899437bacdbe (linus/master)
>>>> Merge: b9388959ba50 16f8036086a9
>>>> Author: Linus Torvalds <torvalds@linux-foundation.org>
>>>> Date:   Wed May 6 20:53:22 2020 -0700
>>>>
>>>>     Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
>>>
>>> Because you have the relevant hashes in your git tree not pruned yet.
>>> Do a new clone and they won't apply.
>>>
>>
>> Yeah, most probably, it knows how to merge. I'm used to sending all my
>> -mm stuff based on -next, so this here is different.
> 
> 
> Documentation/process/5.Posting.rst addresses this:
> 

Thanks for the info.

> 
> Patches must be prepared against a specific version of the kernel.  As a
> general rule, a patch should be based on the current mainline as found in
> Linus's git tree.  When basing on mainline, start with a well-known release
> point - a stable or -rc release - rather than branching off the mainline at
> an arbitrary spot.
> 
> It may become necessary to make versions against -mm, linux-next, or a
> subsystem tree, though, to facilitate wider testing and review.  Depending
> on the area of your patch and what is going on elsewhere, basing a patch
> against these other trees can require a significant amount of work
> resolving conflicts and dealing with API changes.

Yeah, but with -mm patches it is completely impractical to base them
against Linus's git tree.

Patch

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7dca9cd6076b..d641828e5596 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -318,6 +318,7 @@  extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern int remove_memory(int nid, u64 start, u64 size);
 extern void __remove_memory(int nid, u64 start, u64 size);
+extern int offline_and_remove_memory(int nid, u64 start, u64 size);
 
 #else
 static inline void try_offline_node(int nid) {}
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 936bfe208a6e..bf1941f02a60 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1748,4 +1748,41 @@  int remove_memory(int nid, u64 start, u64 size)
 	return rc;
 }
 EXPORT_SYMBOL_GPL(remove_memory);
+
+/*
+ * Try to offline and remove a memory block. Might take a long time to
+ * finish in case memory is still in use. Primarily useful for memory devices
+ * that logically unplugged all memory (so it's no longer in use) and want to
+ * offline + remove the memory block.
+ */
+int offline_and_remove_memory(int nid, u64 start, u64 size)
+{
+	struct memory_block *mem;
+	int rc = -EINVAL;
+
+	if (!IS_ALIGNED(start, memory_block_size_bytes()) ||
+	    size != memory_block_size_bytes())
+		return rc;
+
+	lock_device_hotplug();
+	mem = find_memory_block(__pfn_to_section(PFN_DOWN(start)));
+	if (mem)
+		rc = device_offline(&mem->dev);
+	/* Ignore if the device is already offline. */
+	if (rc > 0)
+		rc = 0;
+
+	/*
+	 * In case we succeeded to offline the memory block, remove it.
+	 * This cannot fail as it cannot get onlined in the meantime.
+	 */
+	if (!rc) {
+		rc = try_remove_memory(nid, start, size);
+		WARN_ON_ONCE(rc);
+	}
+	unlock_device_hotplug();
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(offline_and_remove_memory);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
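
For readers who want to poke at the argument check and the return-code handling outside the kernel, here is a minimal userspace sketch. It is not part of the patch. The names sketch_is_aligned(), sketch_check_range() and sketch_normalize_offline_rc() are hypothetical, and SKETCH_BLOCK_SIZE is an assumed 128 MiB stand-in for whatever memory_block_size_bytes() returns on a given system. It only models the two pure decisions visible in the diff: reject anything that is not exactly one aligned memory block, and treat a positive device_offline() result (already offline) as success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed block size for illustration; the kernel asks memory_block_size_bytes(). */
#define SKETCH_BLOCK_SIZE (128ULL << 20)

/* Userspace stand-in for IS_ALIGNED(); the alignment must be a power of two. */
bool sketch_is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Models the check at the top of offline_and_remove_memory(): the range
 * must start on a block boundary and span exactly one memory block.
 */
int sketch_check_range(uint64_t start, uint64_t size)
{
	if (!sketch_is_aligned(start, SKETCH_BLOCK_SIZE) ||
	    size != SKETCH_BLOCK_SIZE)
		return -EINVAL;
	return 0;
}

/*
 * Models the return-code handling after device_offline(): a positive
 * value means the block was already offline, which the patch ignores.
 * Negative errors are passed through unchanged.
 */
int sketch_normalize_offline_rc(int rc)
{
	return rc > 0 ? 0 : rc;
}
```

With these assumptions, sketch_check_range(0, SKETCH_BLOCK_SIZE) is accepted, while a start of 4096 or a size of two blocks yields -EINVAL. The locking, the memory block lookup and try_remove_memory() are deliberately left out, since they have no meaningful userspace equivalent.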