From patchwork Tue Sep 24 14:47:48 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Igor Mammedov X-Patchwork-Id: 11159181 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A274914DB for ; Tue, 24 Sep 2019 15:46:10 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 82B5D20665 for ; Tue, 24 Sep 2019 15:46:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 82B5D20665 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:47398 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCn0y-0006yf-In for patchwork-qemu-devel@patchwork.kernel.org; Tue, 24 Sep 2019 11:46:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42802) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCm6o-0007ro-N4 for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iCm6n-0003wD-6s for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:06 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45780) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iCm6m-0003ve-Uw; Tue, 24 Sep 2019 10:48:05 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 41CDE7DD11; Tue, 24 Sep 2019 14:48:04 +0000 (UTC) Received: from dell-r430-03.lab.eng.brq.redhat.com (dell-r430-03.lab.eng.brq.redhat.com [10.37.153.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 953AF10013D9; Tue, 24 Sep 2019 14:48:02 +0000 (UTC) From: Igor Mammedov To: qemu-devel@nongnu.org Subject: [PATCH v7 1/4] kvm: extract kvm_log_clear_one_slot Date: Tue, 24 Sep 2019 10:47:48 -0400 Message-Id: <20190924144751.24149-2-imammedo@redhat.com> In-Reply-To: <20190924144751.24149-1-imammedo@redhat.com> References: <20190924144751.24149-1-imammedo@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Tue, 24 Sep 2019 14:48:04 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thuth@redhat.com, david@redhat.com, cohuck@redhat.com, peterx@redhat.com, borntraeger@de.ibm.com, qemu-s390x@nongnu.org, pbonzini@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Paolo Bonzini We may need to clear the dirty bitmap for more than one KVM memslot. First do some code movement with no semantic change. Signed-off-by: Paolo Bonzini Signed-off-by: Igor Mammedov Reviewed-by: Peter Xu --- accel/kvm/kvm-all.c | 102 ++++++++++++++++++++++++-------------------- 1 file changed, 56 insertions(+), 46 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index b09bad0804..e9e6086c09 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -575,55 +575,13 @@ out: #define KVM_CLEAR_LOG_ALIGN (qemu_real_host_page_size << KVM_CLEAR_LOG_SHIFT) #define KVM_CLEAR_LOG_MASK (-KVM_CLEAR_LOG_ALIGN) -/** - * kvm_physical_log_clear - Clear the kernel's dirty bitmap for range - * - * NOTE: this will be a no-op if we haven't enabled manual dirty log - * protection in the host kernel because in that case this operation - * will be done within log_sync(). - * - * @kml: the kvm memory listener - * @section: the memory range to clear dirty bitmap - */ -static int kvm_physical_log_clear(KVMMemoryListener *kml, - MemoryRegionSection *section) +static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start, uint64_t size) { KVMState *s = kvm_state; + uint64_t end, bmap_start, start_delta, bmap_npages; struct kvm_clear_dirty_log d; - uint64_t start, end, bmap_start, start_delta, bmap_npages, size; unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size; - KVMSlot *mem = NULL; - int ret, i; - - if (!s->manual_dirty_log_protect) { - /* No need to do explicit clear */ - return 0; - } - - start = section->offset_within_address_space; - size = int128_get64(section->size); - - if (!size) { - /* Nothing more we can do... */ - return 0; - } - - kvm_slots_lock(kml); - - /* Find any possible slot that covers the section */ - for (i = 0; i < s->nr_slots; i++) { - mem = &kml->slots[i]; - if (mem->start_addr <= start && - start + size <= mem->start_addr + mem->memory_size) { - break; - } - } - - /* - * We should always find one memslot until this point, otherwise - * there could be something wrong from the upper layer - */ - assert(mem && i != s->nr_slots); + int ret; /* * We need to extend either the start or the size or both to @@ -694,7 +652,7 @@ static int kvm_physical_log_clear(KVMMemoryListener *kml, /* It should never overflow. If it happens, say something */ assert(bmap_npages <= UINT32_MAX); d.num_pages = bmap_npages; - d.slot = mem->slot | (kml->as_id << 16); + d.slot = mem->slot | (as_id << 16); if (kvm_vm_ioctl(s, KVM_CLEAR_DIRTY_LOG, &d) == -1) { ret = -errno; @@ -717,6 +675,58 @@ static int kvm_physical_log_clear(KVMMemoryListener *kml, size / psize); /* This handles the NULL case well */ g_free(bmap_clear); + return ret; +} + + +/** + * kvm_physical_log_clear - Clear the kernel's dirty bitmap for range + * + * NOTE: this will be a no-op if we haven't enabled manual dirty log + * protection in the host kernel because in that case this operation + * will be done within log_sync(). + * + * @kml: the kvm memory listener + * @section: the memory range to clear dirty bitmap + */ +static int kvm_physical_log_clear(KVMMemoryListener *kml, + MemoryRegionSection *section) +{ + KVMState *s = kvm_state; + uint64_t start, size; + KVMSlot *mem = NULL; + int ret, i; + + if (!s->manual_dirty_log_protect) { + /* No need to do explicit clear */ + return 0; + } + + start = section->offset_within_address_space; + size = int128_get64(section->size); + + if (!size) { + /* Nothing more we can do... */ + return 0; + } + + kvm_slots_lock(kml); + + /* Find any possible slot that covers the section */ + for (i = 0; i < s->nr_slots; i++) { + mem = &kml->slots[i]; + if (mem->start_addr <= start && + start + size <= mem->start_addr + mem->memory_size) { + break; + } + } + + /* + * We should always find one memslot until this point, otherwise + * there could be something wrong from the upper layer + */ + assert(mem && i != s->nr_slots); + ret = kvm_log_clear_one_slot(mem, kml->as_id, start, size); kvm_slots_unlock(kml); From patchwork Tue Sep 24 14:47:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Igor Mammedov X-Patchwork-Id: 11159041 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CAA2813B1 for ; Tue, 24 Sep 2019 14:50:16 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AA53A207FD for ; Tue, 24 Sep 2019 14:50:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AA53A207FD Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46580 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCm8s-0001mY-PM for patchwork-qemu-devel@patchwork.kernel.org; Tue, 24 Sep 2019 10:50:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42849) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCm6u-0007ww-9q for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iCm6t-0003zk-0I for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:12 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41050) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iCm6s-0003zM-NM; Tue, 24 Sep 2019 10:48:10 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 06A37302246D; Tue, 24 Sep 2019 14:48:10 +0000 (UTC) Received: from dell-r430-03.lab.eng.brq.redhat.com (dell-r430-03.lab.eng.brq.redhat.com [10.37.153.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8474910013D9; Tue, 24 Sep 2019 14:48:04 +0000 (UTC) From: Igor Mammedov To: qemu-devel@nongnu.org Subject: [PATCH v7 2/4] kvm: clear dirty bitmaps from all overlapping memslots Date: Tue, 24 Sep 2019 10:47:49 -0400 Message-Id: <20190924144751.24149-3-imammedo@redhat.com> In-Reply-To: <20190924144751.24149-1-imammedo@redhat.com> References: <20190924144751.24149-1-imammedo@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 24 Sep 2019 14:48:10 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thuth@redhat.com, david@redhat.com, cohuck@redhat.com, peterx@redhat.com, borntraeger@de.ibm.com, qemu-s390x@nongnu.org, pbonzini@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" From: Paolo Bonzini Currently MemoryRegionSection has 1:1 mapping to KVMSlot. However next patch will allow splitting MemoryRegionSection into several KVMSlot-s, make sure that kvm_physical_log_slot_clear() is able to handle such 1:N mapping. Signed-off-by: Paolo Bonzini Signed-off-by: Igor Mammedov Reviewed-by: Peter Xu --- accel/kvm/kvm-all.c | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index e9e6086c09..315a91557f 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -588,8 +588,8 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t start, uint6 * satisfy the KVM interface requirement. Firstly, do the start * page alignment on 64 host pages */ - bmap_start = (start - mem->start_addr) & KVM_CLEAR_LOG_MASK; - start_delta = start - mem->start_addr - bmap_start; + bmap_start = start & KVM_CLEAR_LOG_MASK; + start_delta = start - bmap_start; bmap_start /= psize; /* @@ -693,8 +693,8 @@ static int kvm_physical_log_clear(KVMMemoryListener *kml, MemoryRegionSection *section) { KVMState *s = kvm_state; - uint64_t start, size; - KVMSlot *mem = NULL; + uint64_t start, size, offset, count; + KVMSlot *mem; int ret, i; if (!s->manual_dirty_log_protect) { @@ -712,22 +712,30 @@ static int kvm_physical_log_clear(KVMMemoryListener *kml, kvm_slots_lock(kml); - /* Find any possible slot that covers the section */ for (i = 0; i < s->nr_slots; i++) { mem = &kml->slots[i]; - if (mem->start_addr <= start && - start + size <= mem->start_addr + mem->memory_size) { + /* Discard slots that are empty or do not overlap the section */ + if (!mem->memory_size || + mem->start_addr > start + size - 1 || + start > mem->start_addr + mem->memory_size - 1) { + continue; + } + + if (start >= mem->start_addr) { + /* The slot starts before section or is aligned to it. */ + offset = start - mem->start_addr; + count = MIN(mem->memory_size - offset, size); + } else { + /* The slot starts after section. */ + offset = 0; + count = MIN(mem->memory_size, size - (mem->start_addr - start)); + } + ret = kvm_log_clear_one_slot(mem, kml->as_id, offset, count); + if (ret < 0) { break; } } - /* - * We should always find one memslot until this point, otherwise - * there could be something wrong from the upper layer - */ - assert(mem && i != s->nr_slots); - ret = kvm_log_clear_one_slot(mem, kml->as_id, start, size); - kvm_slots_unlock(kml); return ret; From patchwork Tue Sep 24 14:47:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Igor Mammedov X-Patchwork-Id: 11159193 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 47C47912 for ; Tue, 24 Sep 2019 15:51:19 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 28242214AF for ; Tue, 24 Sep 2019 15:51:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 28242214AF Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:47482 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCn5x-0003Xu-Gh for patchwork-qemu-devel@patchwork.kernel.org; Tue, 24 Sep 2019 11:51:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42867) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCm6w-0007zu-GT for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iCm6u-000420-R0 for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:14 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37072) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iCm6u-00041K-JU; Tue, 24 Sep 2019 10:48:12 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E4BC33CA03; Tue, 24 Sep 2019 14:48:11 +0000 (UTC) Received: from dell-r430-03.lab.eng.brq.redhat.com (dell-r430-03.lab.eng.brq.redhat.com [10.37.153.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4B92B10013D9; Tue, 24 Sep 2019 14:48:10 +0000 (UTC) From: Igor Mammedov To: qemu-devel@nongnu.org Subject: [PATCH v7 3/4] kvm: split too big memory section on several memslots Date: Tue, 24 Sep 2019 10:47:50 -0400 Message-Id: <20190924144751.24149-4-imammedo@redhat.com> In-Reply-To: <20190924144751.24149-1-imammedo@redhat.com> References: <20190924144751.24149-1-imammedo@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.39]); Tue, 24 Sep 2019 14:48:11 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thuth@redhat.com, david@redhat.com, cohuck@redhat.com, peterx@redhat.com, borntraeger@de.ibm.com, qemu-s390x@nongnu.org, pbonzini@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" Max memslot size supported by kvm on s390 is 8Tb, move logic of splitting RAM in chunks upto 8T to KVM code. This way it will hide KVM specific restrictions in KVM code and won't affect board level design decisions. Which would allow us to avoid misusing memory_region_allocate_system_memory() API and eventually use a single hostmem backend for guest RAM. Signed-off-by: Igor Mammedov Reviewed-by: Peter Xu --- v7: * rebase on top of Paolo's series "kvm: clear dirty bitmaps from all overlapping memslots" * use MIN() instead of open coding it * keep KVM_SLOT_MAX_BYTES at original place, and deal with it in the next s390 specific patch * s/slot_size/range_size/ in kvm_physical_log_clear() * use temporary MemoryRegionSection variable and keep original kvm_get_dirty_pages_log_range() untouched v6: * KVM's migration code was assuming 1:1 relation between memslot and MemorySection, which becomes not true fo s390x with this patch. As result migration was broken and dirty logging wasn't even started with when a MemorySection was split on several memslots. Amend related KVM dirty log tracking code to account for split MemorySection. v5: * move computation 'size -= slot_size' inside of loop body (David Hildenbrand ) v4: * fix compilation issue (Christian Borntraeger ) * advance HVA along with GPA in kvm_set_phys_mem() (Christian Borntraeger ) patch prepares only KVM side for switching to single RAM memory region another patch will take care of dropping manual RAM partitioning in s390 code. --- include/sysemu/kvm_int.h | 1 + accel/kvm/kvm-all.c | 124 +++++++++++++++++++++++++-------------- 2 files changed, 81 insertions(+), 44 deletions(-) diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h index 72b2d1b3ae..ac2d1f8b56 100644 --- a/include/sysemu/kvm_int.h +++ b/include/sysemu/kvm_int.h @@ -41,4 +41,5 @@ typedef struct KVMMemoryListener { void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml, AddressSpace *as, int as_id); +void kvm_set_max_memslot_size(hwaddr max_slot_size); #endif diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 315a91557f..f3848e7e75 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -140,6 +140,7 @@ bool kvm_direct_msi_allowed; bool kvm_ioeventfd_any_length_allowed; bool kvm_msi_use_devid; static bool kvm_immediate_exit; +static hwaddr kvm_max_slot_size = ~0; static const KVMCapabilityInfo kvm_required_capabilites[] = { KVM_CAP_INFO(USER_MEMORY), @@ -437,7 +438,7 @@ static int kvm_slot_update_flags(KVMMemoryListener *kml, KVMSlot *mem, static int kvm_section_update_flags(KVMMemoryListener *kml, MemoryRegionSection *section) { - hwaddr start_addr, size; + hwaddr start_addr, size, slot_size; KVMSlot *mem; int ret = 0; @@ -448,13 +449,18 @@ static int kvm_section_update_flags(KVMMemoryListener *kml, kvm_slots_lock(kml); - mem = kvm_lookup_matching_slot(kml, start_addr, size); - if (!mem) { - /* We don't have a slot if we want to trap every access. */ - goto out; - } + while (size && !ret) { + slot_size = MIN(kvm_max_slot_size, size); + mem = kvm_lookup_matching_slot(kml, start_addr, slot_size); + if (!mem) { + /* We don't have a slot if we want to trap every access. */ + goto out; + } - ret = kvm_slot_update_flags(kml, mem, section->mr); + ret = kvm_slot_update_flags(kml, mem, section->mr); + start_addr += slot_size; + size -= slot_size; + } out: kvm_slots_unlock(kml); @@ -527,11 +533,15 @@ static int kvm_physical_sync_dirty_bitmap(KVMMemoryListener *kml, struct kvm_dirty_log d = {}; KVMSlot *mem; hwaddr start_addr, size; + hwaddr slot_size, slot_offset = 0; int ret = 0; size = kvm_align_section(section, &start_addr); - if (size) { - mem = kvm_lookup_matching_slot(kml, start_addr, size); + while (size) { + MemoryRegionSection subsection = *section; + + slot_size = MIN(kvm_max_slot_size, size); + mem = kvm_lookup_matching_slot(kml, start_addr, slot_size); if (!mem) { /* We don't have a slot if we want to trap every access. */ goto out; @@ -549,11 +559,11 @@ static int kvm_physical_sync_dirty_bitmap(KVMMemoryListener *kml, * So for now, let's align to 64 instead of HOST_LONG_BITS here, in * a hope that sizeof(long) won't become >8 any time soon. */ - size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), - /*HOST_LONG_BITS*/ 64) / 8; if (!mem->dirty_bmap) { + hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS), + /*HOST_LONG_BITS*/ 64) / 8; /* Allocate on the first log_sync, once and for all */ - mem->dirty_bmap = g_malloc0(size); + mem->dirty_bmap = g_malloc0(bitmap_size); } d.dirty_bitmap = mem->dirty_bmap; @@ -564,7 +574,13 @@ static int kvm_physical_sync_dirty_bitmap(KVMMemoryListener *kml, goto out; } - kvm_get_dirty_pages_log_range(section, d.dirty_bitmap); + subsection.offset_within_region += slot_offset; + subsection.size = int128_make64(slot_size); + kvm_get_dirty_pages_log_range(&subsection, d.dirty_bitmap); + + slot_offset += slot_size; + start_addr += slot_size; + size -= slot_size; } out: return ret; @@ -971,6 +987,14 @@ kvm_check_extension_list(KVMState *s, const KVMCapabilityInfo *list) return NULL; } +void kvm_set_max_memslot_size(hwaddr max_slot_size) +{ + g_assert( + ROUND_UP(max_slot_size, qemu_real_host_page_size) == max_slot_size + ); + kvm_max_slot_size = max_slot_size; +} + static void kvm_set_phys_mem(KVMMemoryListener *kml, MemoryRegionSection *section, bool add) { @@ -978,7 +1002,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, int err; MemoryRegion *mr = section->mr; bool writeable = !mr->readonly && !mr->rom_device; - hwaddr start_addr, size; + hwaddr start_addr, size, slot_size; void *ram; if (!memory_region_is_ram(mr)) { @@ -1003,41 +1027,52 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, kvm_slots_lock(kml); if (!add) { - mem = kvm_lookup_matching_slot(kml, start_addr, size); - if (!mem) { - goto out; - } - if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { - kvm_physical_sync_dirty_bitmap(kml, section); - } + do { + slot_size = MIN(kvm_max_slot_size, size); + mem = kvm_lookup_matching_slot(kml, start_addr, slot_size); + if (!mem) { + goto out; + } + if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { + kvm_physical_sync_dirty_bitmap(kml, section); + } - /* unregister the slot */ - g_free(mem->dirty_bmap); - mem->dirty_bmap = NULL; - mem->memory_size = 0; - mem->flags = 0; - err = kvm_set_user_memory_region(kml, mem, false); - if (err) { - fprintf(stderr, "%s: error unregistering slot: %s\n", - __func__, strerror(-err)); - abort(); - } + /* unregister the slot */ + g_free(mem->dirty_bmap); + mem->dirty_bmap = NULL; + mem->memory_size = 0; + mem->flags = 0; + err = kvm_set_user_memory_region(kml, mem, false); + if (err) { + fprintf(stderr, "%s: error unregistering slot: %s\n", + __func__, strerror(-err)); + abort(); + } + start_addr += slot_size; + size -= slot_size; + } while (size); goto out; } /* register the new slot */ - mem = kvm_alloc_slot(kml); - mem->memory_size = size; - mem->start_addr = start_addr; - mem->ram = ram; - mem->flags = kvm_mem_flags(mr); - - err = kvm_set_user_memory_region(kml, mem, true); - if (err) { - fprintf(stderr, "%s: error registering slot: %s\n", __func__, - strerror(-err)); - abort(); - } + do { + slot_size = MIN(kvm_max_slot_size, size); + mem = kvm_alloc_slot(kml); + mem->memory_size = slot_size; + mem->start_addr = start_addr; + mem->ram = ram; + mem->flags = kvm_mem_flags(mr); + + err = kvm_set_user_memory_region(kml, mem, true); + if (err) { + fprintf(stderr, "%s: error registering slot: %s\n", __func__, + strerror(-err)); + abort(); + } + start_addr += slot_size; + ram += slot_size; + size -= slot_size; + } while (size); out: kvm_slots_unlock(kml); @@ -2877,6 +2912,7 @@ static bool kvm_accel_has_memory(MachineState *ms, AddressSpace *as, for (i = 0; i < kvm->nr_as; ++i) { if (kvm->as[i].as == as && kvm->as[i].ml) { + size = MIN(kvm_max_slot_size, size); return NULL != kvm_lookup_matching_slot(kvm->as[i].ml, start_addr, size); } From patchwork Tue Sep 24 14:47:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Igor Mammedov X-Patchwork-Id: 11159057 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B97F1112B for ; Tue, 24 Sep 2019 15:02:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9A9A621655 for ; Tue, 24 Sep 2019 15:02:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9A9A621655 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Received: from localhost ([::1]:46734 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCmKt-00061k-Vr for patchwork-qemu-devel@patchwork.kernel.org; Tue, 24 Sep 2019 11:02:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42885) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iCm6y-00081y-AK for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iCm6w-000439-RT for qemu-devel@nongnu.org; Tue, 24 Sep 2019 10:48:16 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46676) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1iCm6w-00042e-JP; Tue, 24 Sep 2019 10:48:14 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E3E1B99C42; Tue, 24 Sep 2019 14:48:13 +0000 (UTC) Received: from dell-r430-03.lab.eng.brq.redhat.com (dell-r430-03.lab.eng.brq.redhat.com [10.37.153.18]) by smtp.corp.redhat.com (Postfix) with ESMTP id 38DAF10013D9; Tue, 24 Sep 2019 14:48:12 +0000 (UTC) From: Igor Mammedov To: qemu-devel@nongnu.org Subject: [PATCH v7 4/4] s390: do not call memory_region_allocate_system_memory() multiple times Date: Tue, 24 Sep 2019 10:47:51 -0400 Message-Id: <20190924144751.24149-5-imammedo@redhat.com> In-Reply-To: <20190924144751.24149-1-imammedo@redhat.com> References: <20190924144751.24149-1-imammedo@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Tue, 24 Sep 2019 14:48:13 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thuth@redhat.com, david@redhat.com, cohuck@redhat.com, peterx@redhat.com, borntraeger@de.ibm.com, qemu-s390x@nongnu.org, pbonzini@redhat.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" s390 was trying to solve limited KVM memslot size issue by abusing memory_region_allocate_system_memory(), which breaks API contract where the function might be called only once. Beside an invalid use of API, the approach also introduced migration issue, since RAM chunks for each KVM_SLOT_MAX_BYTES are transferred in migration stream as separate RAMBlocks. After discussion [1], it was agreed to break migration from older QEMU for guest with RAM >8Tb (as it was relatively new (since 2.12) and considered to be not actually used downstream). Migration should keep working for guests with less than 8TB and for more than 8TB with QEMU 4.2 and newer binary. In case user tries to migrate more than 8TB guest, between incompatible QEMU versions, migration should fail gracefully due to non-exiting RAMBlock ID or RAMBlock size mismatch. Taking in account above and that now KVM code is able to split too big MemorySection into several memslots, partially revert commit (bb223055b s390-ccw-virtio: allow for systems larger that 7.999TB) and use kvm_set_max_memslot_size() to set KVMSlot size to KVM_SLOT_MAX_BYTES. 1) [PATCH RFC v2 4/4] s390: do not call memory_region_allocate_system_memory() multiple times Signed-off-by: Igor Mammedov Acked-by: Peter Xu Signed-off-by: Christian Borntraeger Acked-by: Igor Mammedov Acked-by: Paolo Bonzini --- v7: - move KVM_SLOT_MAX_BYTES movement from kvm specific patch to here (Peter Xu ) v3: - drop migration compat code PS: I don't have access to a suitable system to test it with 8Tb split, so I've tested only hacked up KVM_SLOT_MAX_BYTES = 1Gb variant --- hw/s390x/s390-virtio-ccw.c | 30 +++--------------------------- target/s390x/kvm.c | 11 +++++++++++ 2 files changed, 14 insertions(+), 27 deletions(-) diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 8bfb6684cb..18ad279a00 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -154,39 +154,15 @@ static void virtio_ccw_register_hcalls(void) virtio_ccw_hcall_early_printk); } -/* - * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages - * as the dirty bitmap must be managed by bitops that take an int as - * position indicator. If we have a guest beyond that we will split off - * new subregions. The split must happen on a segment boundary (1MB). - */ -#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1) -#define SEG_MSK (~0xfffffULL) -#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK) static void s390_memory_init(ram_addr_t mem_size) { MemoryRegion *sysmem = get_system_memory(); - ram_addr_t chunk, offset = 0; - unsigned int number = 0; + MemoryRegion *ram = g_new(MemoryRegion, 1); Error *local_err = NULL; - gchar *name; /* allocate RAM for core */ - name = g_strdup_printf("s390.ram"); - while (mem_size) { - MemoryRegion *ram = g_new(MemoryRegion, 1); - uint64_t size = mem_size; - - /* KVM does not allow memslots >= 8 TB */ - chunk = MIN(size, KVM_SLOT_MAX_BYTES); - memory_region_allocate_system_memory(ram, NULL, name, chunk); - memory_region_add_subregion(sysmem, offset, ram); - mem_size -= chunk; - offset += chunk; - g_free(name); - name = g_strdup_printf("s390.ram.%u", ++number); - } - g_free(name); + memory_region_allocate_system_memory(ram, NULL, "s390.ram", mem_size); + memory_region_add_subregion(sysmem, 0, ram); /* * Configure the maximum page size. As no memory devices were created diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c index 97a662ad0e..54864c259c 100644 --- a/target/s390x/kvm.c +++ b/target/s390x/kvm.c @@ -28,6 +28,7 @@ #include "cpu.h" #include "internal.h" #include "kvm_s390x.h" +#include "sysemu/kvm_int.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "qemu/timer.h" @@ -122,6 +123,15 @@ */ #define VCPU_IRQ_BUF_SIZE(max_cpus) (sizeof(struct kvm_s390_irq) * \ (max_cpus + NR_LOCAL_IRQS)) +/* + * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages + * as the dirty bitmap must be managed by bitops that take an int as + * position indicator. If we have a guest beyond that we will split off + * new subregions. The split must happen on a segment boundary (1MB). + */ +#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1) +#define SEG_MSK (~0xfffffULL) +#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK) static CPUWatchpoint hw_watchpoint; /* @@ -355,6 +365,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s) */ /* kvm_vm_enable_cap(s, KVM_CAP_S390_AIS, 0); */ + kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES); return 0; }