diff mbox series

[v1,07/11] virtio-mem: Allow to offline partially unplugged memory blocks

Message ID 20200302134941.315212-8-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series virtio-mem: paravirtualized memory | expand

Commit Message

David Hildenbrand March 2, 2020, 1:49 p.m. UTC
Dropping the reference count of PageOffline() pages allows offlining
code to skip them. However, we also have to convert PG_reserved to
another flag - let's use PG_dirty - so has_unmovable_pages() will
properly handle them. PG_reserved pages get detected as unmovable right
away.

We need the flag to see if we are onlining pages the first time, or if
we allocated them via alloc_contig_range().

Properly take care of offlining code also modifying the stats and
special handling in case the driver gets unloaded.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/virtio/virtio_mem.c | 64 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

Comments

Michal Hocko March 10, 2020, 11:43 a.m. UTC | #1
On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
[...]
> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
> +					    unsigned long mb_id)
> +{
> +	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
> +	unsigned long pfn;
> +	int sb_id, i;
> +
> +	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
> +		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
> +			continue;
> +		/*
> +		 * Drop our reference to the pages so the memory can get
> +		 * offlined and add the unplugged pages to the managed
> +		 * page counters (so offlining code can correctly subtract
> +		 * them again).
> +		 */
> +		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
> +			       sb_id * vm->subblock_size);
> +		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
> +		for (i = 0; i < nr_pages; i++)
> +			page_ref_dec(pfn_to_page(pfn + i));

Is there ever situation this might be a different than 1->0 transition?
David Hildenbrand March 10, 2020, 11:46 a.m. UTC | #2
On 10.03.20 12:43, Michal Hocko wrote:
> On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
> [...]
>> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
>> +					    unsigned long mb_id)
>> +{
>> +	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
>> +	unsigned long pfn;
>> +	int sb_id, i;
>> +
>> +	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
>> +		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
>> +			continue;
>> +		/*
>> +		 * Drop our reference to the pages so the memory can get
>> +		 * offlined and add the unplugged pages to the managed
>> +		 * page counters (so offlining code can correctly subtract
>> +		 * them again).
>> +		 */
>> +		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
>> +			       sb_id * vm->subblock_size);
>> +		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
>> +		for (i = 0; i < nr_pages; i++)
>> +			page_ref_dec(pfn_to_page(pfn + i));
> 
> Is there ever situation this might be a different than 1->0 transition?

Only if some other code would be taking a reference. At least not from
virtio-mem perspective.
Michal Hocko March 10, 2020, 11:59 a.m. UTC | #3
On Tue 10-03-20 12:46:05, David Hildenbrand wrote:
> On 10.03.20 12:43, Michal Hocko wrote:
> > On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
> > [...]
> >> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
> >> +					    unsigned long mb_id)
> >> +{
> >> +	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
> >> +	unsigned long pfn;
> >> +	int sb_id, i;
> >> +
> >> +	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
> >> +		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
> >> +			continue;
> >> +		/*
> >> +		 * Drop our reference to the pages so the memory can get
> >> +		 * offlined and add the unplugged pages to the managed
> >> +		 * page counters (so offlining code can correctly subtract
> >> +		 * them again).
> >> +		 */
> >> +		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
> >> +			       sb_id * vm->subblock_size);
> >> +		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
> >> +		for (i = 0; i < nr_pages; i++)
> >> +			page_ref_dec(pfn_to_page(pfn + i));
> > 
> > Is there ever situation this might be a different than 1->0 transition?
> 
> Only if some other code would be taking a reference. At least not from
> virtio-mem perspective.

OK, so that is essentially an error condition. I think it shouldn't go
silent and you want something like
	if (WARN_ON(!page_ref_sub_and_test(page)))
		dump_page(pfn_to_page(pfn + i), "YOUR REASON");
David Hildenbrand March 10, 2020, 12:09 p.m. UTC | #4
On 10.03.20 12:59, Michal Hocko wrote:
> On Tue 10-03-20 12:46:05, David Hildenbrand wrote:
>> On 10.03.20 12:43, Michal Hocko wrote:
>>> On Mon 02-03-20 14:49:37, David Hildenbrand wrote:
>>> [...]
>>>> +static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
>>>> +					    unsigned long mb_id)
>>>> +{
>>>> +	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
>>>> +	unsigned long pfn;
>>>> +	int sb_id, i;
>>>> +
>>>> +	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
>>>> +		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
>>>> +			continue;
>>>> +		/*
>>>> +		 * Drop our reference to the pages so the memory can get
>>>> +		 * offlined and add the unplugged pages to the managed
>>>> +		 * page counters (so offlining code can correctly subtract
>>>> +		 * them again).
>>>> +		 */
>>>> +		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
>>>> +			       sb_id * vm->subblock_size);
>>>> +		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
>>>> +		for (i = 0; i < nr_pages; i++)
>>>> +			page_ref_dec(pfn_to_page(pfn + i));
>>>
>>> Is there ever situation this might be a different than 1->0 transition?
>>
>> Only if some other code would be taking a reference. At least not from
>> virtio-mem perspective.
> 
> OK, so that is essentially an error condition. I think it shouldn't go
> silent and you want something like
> 	if (WARN_ON(!page_ref_sub_and_test(page)))
> 		dump_page(pfn_to_page(pfn + i), "YOUR REASON");
> 

Makes sense -  I'll most probably convert this to a WARN_ON_ONCE.

Thanks!
diff mbox series

Patch

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 5b26d57be551..2916f8b970fa 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -570,6 +570,53 @@  static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id,
 		virtio_mem_retry(vm);
 }
 
+static void virtio_mem_notify_going_offline(struct virtio_mem *vm,
+					    unsigned long mb_id)
+{
+	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
+	unsigned long pfn;
+	int sb_id, i;
+
+	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
+		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
+			continue;
+		/*
+		 * Drop our reference to the pages so the memory can get
+		 * offlined and add the unplugged pages to the managed
+		 * page counters (so offlining code can correctly subtract
+		 * them again).
+		 */
+		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
+			       sb_id * vm->subblock_size);
+		adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
+		for (i = 0; i < nr_pages; i++)
+			page_ref_dec(pfn_to_page(pfn + i));
+	}
+}
+
+static void virtio_mem_notify_cancel_offline(struct virtio_mem *vm,
+					     unsigned long mb_id)
+{
+	const unsigned long nr_pages = PFN_DOWN(vm->subblock_size);
+	unsigned long pfn;
+	int sb_id, i;
+
+	for (sb_id = 0; sb_id < vm->nb_sb_per_mb; sb_id++) {
+		if (virtio_mem_mb_test_sb_plugged(vm, mb_id, sb_id, 1))
+			continue;
+		/*
+		 * Get the reference we dropped when going offline and
+		 * subtract the unplugged pages from the managed page
+		 * counters.
+		 */
+		pfn = PFN_DOWN(virtio_mem_mb_id_to_phys(mb_id) +
+			       sb_id * vm->subblock_size);
+		adjust_managed_page_count(pfn_to_page(pfn), -nr_pages);
+		for (i = 0; i < nr_pages; i++)
+			page_ref_inc(pfn_to_page(pfn + i));
+	}
+}
+
 /*
  * This callback will either be called synchronously from add_memory() or
  * asynchronously (e.g., triggered via user space). We have to be careful
@@ -616,6 +663,7 @@  static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
 			break;
 		}
 		vm->hotplug_active = true;
+		virtio_mem_notify_going_offline(vm, mb_id);
 		break;
 	case MEM_GOING_ONLINE:
 		mutex_lock(&vm->hotplug_mutex);
@@ -640,6 +688,12 @@  static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
 		mutex_unlock(&vm->hotplug_mutex);
 		break;
 	case MEM_CANCEL_OFFLINE:
+		if (!vm->hotplug_active)
+			break;
+		virtio_mem_notify_cancel_offline(vm, mb_id);
+		vm->hotplug_active = false;
+		mutex_unlock(&vm->hotplug_mutex);
+		break;
 	case MEM_CANCEL_ONLINE:
 		if (!vm->hotplug_active)
 			break;
@@ -666,8 +720,11 @@  static void virtio_mem_set_fake_offline(unsigned long pfn,
 		struct page *page = pfn_to_page(pfn);
 
 		__SetPageOffline(page);
-		if (!onlined)
+		if (!onlined) {
 			SetPageDirty(page);
+			/* FIXME: remove after cleanups */
+			ClearPageReserved(page);
+		}
 	}
 }
 
@@ -1717,6 +1774,11 @@  static void virtio_mem_remove(struct virtio_device *vdev)
 		rc = virtio_mem_mb_remove(vm, mb_id);
 		BUG_ON(rc);
 	}
+	/*
+	 * After we unregistered our callbacks, user space can no longer
+	 * offline partially plugged online memory blocks. No need to worry
+	 * about them.
+	 */
 
 	/* unregister callbacks */
 	unregister_virtio_mem_device(vm);