From patchwork Mon Oct 31 00:36:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025310 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2FAAFA373D for ; Mon, 31 Oct 2022 00:39:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229648AbiJaAjE (ORCPT ); Sun, 30 Oct 2022 20:39:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229441AbiJaAjC (ORCPT ); Sun, 30 Oct 2022 20:39:02 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F7C55FDA for ; Sun, 30 Oct 2022 17:38:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176687; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JBWp6G7ESBqgUSjMjedexAphBYwtoS+ivslpQ3iTEvk=; b=UQmuwM9jSVjk6AUwJpkBR4FkTC0C8bxH2FHHEPRXtbUoybwWMEldH4GZ4ZDOd1KbR5h5c5 u7cKxTEbXHw3cGCJTRzbOHg18lEkNOiBUHyCdb1tIb76kabw3L7om9pB/+jjiduJQ4vHNa InGZiZCeRXlkiphCOuT1WeJgkX+1Yww= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-141-5ALPAX3ANY-cO7jb6HbSnA-1; Sun, 30 Oct 2022 20:38:04 -0400 X-MC-Unique: 5ALPAX3ANY-cO7jb6HbSnA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 59605833A06; Mon, 31 Oct 2022 00:38:03 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EECBD40C6F9F; Mon, 31 Oct 2022 00:37:55 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 1/9] KVM: x86: Introduce KVM_REQ_DIRTY_RING_SOFT_FULL Date: Mon, 31 Oct 2022 08:36:13 +0800 Message-Id: <20221031003621.164306-2-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The VCPU isn't expected to be runnable when the dirty ring becomes soft full, until the dirty pages are harvested and the dirty ring is reset from userspace. So there is a check in each guest's entrace to see if the dirty ring is soft full or not. The VCPU is stopped from running if its dirty ring has been soft full. The similar check will be needed when the feature is going to be supported on ARM64. As Marc Zyngier suggested, a new event will avoid pointless overhead to check the size of the dirty ring ('vcpu->kvm->dirty_ring_size') in each guest's entrance. Add KVM_REQ_DIRTY_RING_SOFT_FULL. The event is raised when the dirty ring becomes soft full in kvm_dirty_ring_push(). The event is cleared in the check, done in the newly added helper kvm_dirty_ring_check_request(), or when the dirty ring is reset by userspace. Since the VCPU is not runnable when the dirty ring becomes soft full, the KVM_REQ_DIRTY_RING_SOFT_FULL event is always set to prevent the VCPU from running until the dirty pages are harvested and the dirty ring is reset by userspace. kvm_dirty_ring_soft_full() becomes a private function with the newly added helper kvm_dirty_ring_check_request(). The alignment for the various event definitions in kvm_host.h is changed to tab character by the way. In order to avoid using 'container_of()', the argument @ring is replaced by @vcpu in kvm_dirty_ring_push() and kvm_dirty_ring_reset(). The argument @kvm to kvm_dirty_ring_reset() is dropped since it can be retrieved from the VCPU. Link: https://lore.kernel.org/kvmarm/87lerkwtm5.wl-maz@kernel.org Suggested-by: Marc Zyngier Signed-off-by: Gavin Shan Reviewed-by: Peter Xu Reviewed-by: Sean Christopherson --- arch/x86/kvm/x86.c | 15 ++++++--------- include/linux/kvm_dirty_ring.h | 17 ++++++----------- include/linux/kvm_host.h | 9 +++++---- virt/kvm/dirty_ring.c | 34 +++++++++++++++++++++++++++++++--- virt/kvm/kvm_main.c | 5 ++--- 5 files changed, 50 insertions(+), 30 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9cf1ba865562..d0d32e67ebf3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10499,20 +10499,17 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) bool req_immediate_exit = false; - /* Forbid vmenter if vcpu dirty ring is soft-full */ - if (unlikely(vcpu->kvm->dirty_ring_size && - kvm_dirty_ring_soft_full(&vcpu->dirty_ring))) { - vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL; - trace_kvm_dirty_ring_exit(vcpu); - r = 0; - goto out; - } - if (kvm_request_pending(vcpu)) { if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) { r = -EIO; goto out; } + + if (kvm_dirty_ring_check_request(vcpu)) { + r = 0; + goto out; + } + if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) { r = 0; diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h index 906f899813dc..53a36f38d15e 100644 --- a/include/linux/kvm_dirty_ring.h +++ b/include/linux/kvm_dirty_ring.h @@ -43,13 +43,12 @@ static inline int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, return 0; } -static inline int kvm_dirty_ring_reset(struct kvm *kvm, - struct kvm_dirty_ring *ring) +static inline int kvm_dirty_ring_reset(struct kvm_vcpu *vcpu) { return 0; } -static inline void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, +static inline void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset) { } @@ -64,11 +63,6 @@ static inline void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) { } -static inline bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring) -{ - return true; -} - #else /* CONFIG_HAVE_KVM_DIRTY_RING */ u32 kvm_dirty_ring_get_rsvd_entries(void); @@ -78,19 +72,20 @@ int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size); * called with kvm->slots_lock held, returns the number of * processed pages. */ -int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring); +int kvm_dirty_ring_reset(struct kvm_vcpu *vcpu); /* * returns =0: successfully pushed * <0: unable to push, need to wait */ -void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset); +void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset); + +bool kvm_dirty_ring_check_request(struct kvm_vcpu *vcpu); /* for use in vm_operations_struct */ struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset); void kvm_dirty_ring_free(struct kvm_dirty_ring *ring); -bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring); #endif /* CONFIG_HAVE_KVM_DIRTY_RING */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 00c3448ba7f8..648d663f32c4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -153,10 +153,11 @@ static inline bool is_error_page(struct page *page) * Architecture-independent vcpu->requests bit members * Bits 3-7 are reserved for more arch-independent bits. */ -#define KVM_REQ_TLB_FLUSH (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) -#define KVM_REQ_VM_DEAD (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) -#define KVM_REQ_UNBLOCK 2 -#define KVM_REQUEST_ARCH_BASE 8 +#define KVM_REQ_TLB_FLUSH (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) +#define KVM_REQ_VM_DEAD (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) +#define KVM_REQ_UNBLOCK 2 +#define KVM_REQ_DIRTY_RING_SOFT_FULL 3 +#define KVM_REQUEST_ARCH_BASE 8 /* * KVM_REQ_OUTSIDE_GUEST_MODE exists is purely as way to force the vCPU to diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index d6fabf238032..6091e1403bc8 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -26,7 +26,7 @@ static u32 kvm_dirty_ring_used(struct kvm_dirty_ring *ring) return READ_ONCE(ring->dirty_index) - READ_ONCE(ring->reset_index); } -bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring) +static bool kvm_dirty_ring_soft_full(struct kvm_dirty_ring *ring) { return kvm_dirty_ring_used(ring) >= ring->soft_limit; } @@ -87,8 +87,10 @@ static inline bool kvm_dirty_gfn_harvested(struct kvm_dirty_gfn *gfn) return smp_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_RESET; } -int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring) +int kvm_dirty_ring_reset(struct kvm_vcpu *vcpu) { + struct kvm *kvm = vcpu->kvm; + struct kvm_dirty_ring *ring = &vcpu->dirty_ring; u32 cur_slot, next_slot; u64 cur_offset, next_offset; unsigned long mask; @@ -142,13 +144,17 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring) kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask); + if (!kvm_dirty_ring_soft_full(ring)) + kvm_clear_request(KVM_REQ_DIRTY_RING_SOFT_FULL, vcpu); + trace_kvm_dirty_ring_reset(ring); return count; } -void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset) +void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset) { + struct kvm_dirty_ring *ring = &vcpu->dirty_ring; struct kvm_dirty_gfn *entry; /* It should never get full */ @@ -166,6 +172,28 @@ void kvm_dirty_ring_push(struct kvm_dirty_ring *ring, u32 slot, u64 offset) kvm_dirty_gfn_set_dirtied(entry); ring->dirty_index++; trace_kvm_dirty_ring_push(ring, slot, offset); + + if (kvm_dirty_ring_soft_full(ring)) + kvm_make_request(KVM_REQ_DIRTY_RING_SOFT_FULL, vcpu); +} + +bool kvm_dirty_ring_check_request(struct kvm_vcpu *vcpu) +{ + /* + * The VCPU isn't runnable when the dirty ring becomes soft full. + * The KVM_REQ_DIRTY_RING_SOFT_FULL event is always set to prevent + * the VCPU from running until the dirty pages are harvested and + * the dirty ring is reset by userspace. + */ + if (kvm_check_request(KVM_REQ_DIRTY_RING_SOFT_FULL, vcpu) && + kvm_dirty_ring_soft_full(&vcpu->dirty_ring)) { + kvm_make_request(KVM_REQ_DIRTY_RING_SOFT_FULL, vcpu); + vcpu->run->exit_reason = KVM_EXIT_DIRTY_RING_FULL; + trace_kvm_dirty_ring_exit(vcpu); + return true; + } + + return false; } struct page *kvm_dirty_ring_get_page(struct kvm_dirty_ring *ring, u32 offset) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 1376a47fedee..30ff73931e1c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3314,8 +3314,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, u32 slot = (memslot->as_id << 16) | memslot->id; if (kvm->dirty_ring_size) - kvm_dirty_ring_push(&vcpu->dirty_ring, - slot, rel_gfn); + kvm_dirty_ring_push(vcpu, slot, rel_gfn); else set_bit_le(rel_gfn, memslot->dirty_bitmap); } @@ -4543,7 +4542,7 @@ static int kvm_vm_ioctl_reset_dirty_pages(struct kvm *kvm) mutex_lock(&kvm->slots_lock); kvm_for_each_vcpu(i, vcpu, kvm) - cleared += kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring); + cleared += kvm_dirty_ring_reset(vcpu); mutex_unlock(&kvm->slots_lock); From patchwork Mon Oct 31 00:36:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025312 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50F75C38A02 for ; Mon, 31 Oct 2022 00:39:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229695AbiJaAjV (ORCPT ); Sun, 30 Oct 2022 20:39:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229686AbiJaAjP (ORCPT ); Sun, 30 Oct 2022 20:39:15 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0915595BC for ; Sun, 30 Oct 2022 17:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176696; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=erX4o23Z0ei2lYxmuM+Ux8z75OGb9vWYbfZmMdyVy88=; b=H42uglS3oHdUQnaw2Za6s5i2A4FnTnxB/N85aU8nJ94ie7lVRnc1X+BBX7GemkI/LyLOzP K82MawONU9kwlsnh33GWk+7vMySzwCvHAKv4UkKyhbUgV7RWZoRr91ItLPyMGkHDUEyqtc xXs5Vt62b9aN9rXZqDj9t1D3MuBzSH4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-417-uwN95PzoORCvPyi7wBrbgA-1; Sun, 30 Oct 2022 20:38:11 -0400 X-MC-Unique: uwN95PzoORCvPyi7wBrbgA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CD08E811E67; Mon, 31 Oct 2022 00:38:10 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EAF3140C6F9F; Mon, 31 Oct 2022 00:38:03 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 2/9] KVM: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Date: Mon, 31 Oct 2022 08:36:14 +0800 Message-Id: <20221031003621.164306-3-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Not all architectures like ARM64 need to override the function. Move its declaration to kvm_dirty_ring.h to avoid the following compiling warning on ARM64 when the feature is enabled. arch/arm64/kvm/../../../virt/kvm/dirty_ring.c:14:12: \ warning: no previous prototype for 'kvm_cpu_dirty_log_size' \ [-Wmissing-prototypes] \ int __weak kvm_cpu_dirty_log_size(void) Reported-by: kernel test robot Signed-off-by: Gavin Shan Reviewed-by: Peter Xu --- arch/x86/include/asm/kvm_host.h | 2 -- include/linux/kvm_dirty_ring.h | 1 + 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 7551b6f9c31c..b4dbde7d9eb1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2090,8 +2090,6 @@ static inline int kvm_cpu_get_apicid(int mps_cpu) #define GET_SMSTATE(type, buf, offset) \ (*(type *)((buf) + (offset) - 0x7e00)) -int kvm_cpu_dirty_log_size(void); - int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages); #define KVM_CLOCK_VALID_FLAGS \ diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h index 53a36f38d15e..04290eda0852 100644 --- a/include/linux/kvm_dirty_ring.h +++ b/include/linux/kvm_dirty_ring.h @@ -65,6 +65,7 @@ static inline void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) #else /* CONFIG_HAVE_KVM_DIRTY_RING */ +int kvm_cpu_dirty_log_size(void); u32 kvm_dirty_ring_get_rsvd_entries(void); int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size); From patchwork Mon Oct 31 00:36:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27CF7C38A02 for ; Mon, 31 Oct 2022 00:39:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229679AbiJaAjR (ORCPT ); Sun, 30 Oct 2022 20:39:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229651AbiJaAjP (ORCPT ); Sun, 30 Oct 2022 20:39:15 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2491295BB for ; Sun, 30 Oct 2022 17:38:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176703; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BHVR2QQLq5gmxvnkMA+rSKhZlYFOe38pzpUbmKo/1hk=; b=RmG/lg/y/gzT02wQG+UJTtRbhluXyFw7HeI+JEH59u22Oxkd6mjLGykpkPmRdEPou37fpe TcM+n3esXAO7guVp250ZGXhf47LQQCSWAIImZBVSgSfQxwzTYFzJp2y0owZKtuOVFvKkLy zOOZXfxw1ns6OQwKJCP82C+tJoxYesI= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-644-CqV6AEGKPfitiYbjqa1SUw-1; Sun, 30 Oct 2022 20:38:18 -0400 X-MC-Unique: CqV6AEGKPfitiYbjqa1SUw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 09FAA3C0F7E6; Mon, 31 Oct 2022 00:38:18 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6A87E40C6F9F; Mon, 31 Oct 2022 00:38:11 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 3/9] KVM: Check KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL} prior to enabling them Date: Mon, 31 Oct 2022 08:36:15 +0800 Message-Id: <20221031003621.164306-4-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There are two capabilities related to ring-based dirty page tracking: KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Both are supported by x86. However, arm64 supports KVM_CAP_DIRTY_LOG_RING_ACQ_REL only when the feature is supported on arm64. The userspace doesn't have to enable the advertised capability, meaning KVM_CAP_DIRTY_LOG_RING can be enabled on arm64 by userspace and it's wrong. Fix it by double checking if the capability has been advertised prior to enabling it. It's rejected to enable the capability if it hasn't been advertised. Fixes: 17601bfed909 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Reported-by: Sean Christopherson Suggested-by: Sean Christopherson Signed-off-by: Gavin Shan Reviewed-by: Oliver Upton --- virt/kvm/kvm_main.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 30ff73931e1c..91cf51a25394 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4584,6 +4584,9 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, } case KVM_CAP_DIRTY_LOG_RING: case KVM_CAP_DIRTY_LOG_RING_ACQ_REL: + if (!kvm_vm_ioctl_check_extension_generic(kvm, cap->cap)) + return -EINVAL; + return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]); default: return kvm_vm_ioctl_enable_cap(kvm, cap); From patchwork Mon Oct 31 00:36:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025314 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6ABB3FA3743 for ; Mon, 31 Oct 2022 00:39:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229707AbiJaAje (ORCPT ); Sun, 30 Oct 2022 20:39:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59998 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229681AbiJaAja (ORCPT ); Sun, 30 Oct 2022 20:39:30 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD69995BD for ; Sun, 30 Oct 2022 17:38:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176710; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ujIUXWeFWzOovk7/XIyv6uEbFn+FdjaHXLz6Pf3fpbU=; b=ClPCMK09jkAqZtXCGwXda/+knFZP08huVkKa822+3saIrzbGySNGSoM3XzdQyBAyZcIJlU f6I0bY7sqX0S9ToOnQF6NGLHNk8c9Uqb8SSbd3wiEe/VtkpBSSFX8kl7+WOD0kj897+uIt qgw5P9l40Jbs+x3EqKMpjfnO5wQq6HU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-139-s9OzFTavNn6wd1ZUPRxYMQ-1; Sun, 30 Oct 2022 20:38:27 -0400 X-MC-Unique: s9OzFTavNn6wd1ZUPRxYMQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A91E785A583; Mon, 31 Oct 2022 00:38:26 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 884AB40C6F9F; Mon, 31 Oct 2022 00:38:18 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 4/9] KVM: Support dirty ring in conjunction with bitmap Date: Mon, 31 Oct 2022 08:36:16 +0800 Message-Id: <20221031003621.164306-5-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is enabled. It's conflicting with that ring-based dirty page tracking always requires a running VCPU context. Introduce a new flavor of dirty ring that requires the use of both VCPU dirty rings and a dirty bitmap. The expectation is that for non-VCPU sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to the dirty bitmap. Userspace should scan the dirty bitmap before migrating the VM to the target. Use an additional capability to advertise this behavior. The newly added capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Suggested-by: Marc Zyngier Suggested-by: Peter Xu Co-developed-by: Oliver Upton Signed-off-by: Oliver Upton Signed-off-by: Gavin Shan Acked-by: Peter Xu --- Documentation/virt/kvm/api.rst | 31 ++++++++++++++++++++++++------- include/linux/kvm_dirty_ring.h | 6 ++++++ include/linux/kvm_host.h | 1 + include/uapi/linux/kvm.h | 1 + virt/kvm/Kconfig | 8 ++++++++ virt/kvm/dirty_ring.c | 5 +++++ virt/kvm/kvm_main.c | 31 ++++++++++++++++++++++--------- 7 files changed, 67 insertions(+), 16 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index eee9f857a986..4d4eeb5c3c5a 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -8003,13 +8003,6 @@ flushing is done by the KVM_GET_DIRTY_LOG ioctl). To achieve that, one needs to kick the vcpu out of KVM_RUN using a signal. The resulting vmexit ensures that all dirty GFNs are flushed to the dirty rings. -NOTE: the capability KVM_CAP_DIRTY_LOG_RING and the corresponding -ioctl KVM_RESET_DIRTY_RINGS are mutual exclusive to the existing ioctls -KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG. After enabling -KVM_CAP_DIRTY_LOG_RING with an acceptable dirty ring size, the virtual -machine will switch to ring-buffer dirty page tracking and further -KVM_GET_DIRTY_LOG or KVM_CLEAR_DIRTY_LOG ioctls will fail. - NOTE: KVM_CAP_DIRTY_LOG_RING_ACQ_REL is the only capability that should be exposed by weakly ordered architecture, in order to indicate the additional memory ordering requirements imposed on userspace when @@ -8018,6 +8011,30 @@ Architecture with TSO-like ordering (such as x86) are allowed to expose both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LOG_RING_ACQ_REL to userspace. +After using the dirty rings, the userspace needs to detect the capability +of KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP to see whether the ring structures +need to be backed by per-slot bitmaps. With this capability advertised +and supported, it means the architecture can dirty guest pages without +vcpu/ring context, so that some of the dirty information will still be +maintained in the bitmap structure. + +Note that the bitmap here is only a backup of the ring structure, and +normally should only contain a very small amount of dirty pages, which +needs to be transferred during VM downtime. Collecting the dirty bitmap +should be the very last thing that the VMM does before transmitting state +to the target VM. VMM needs to ensure that the dirty state is final and +avoid missing dirty pages from another ioctl ordered after the bitmap +collection. + +To collect dirty bits in the backup bitmap, the userspace can use the +same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed +and its behavior is undefined since collecting the dirty bitmap always +happens in the last phase of VM's migration. + +NOTE: One example of using the backup bitmap is saving arm64 vgic/its +tables through KVM_DEV_ARM_{VGIC_GRP_CTRL, ITS_SAVE_TABLES} command on +KVM device "kvm-arm-vgic-its" during VM's migration. + 8.30 KVM_CAP_XEN_HVM -------------------- diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h index 04290eda0852..b08b9afd8bdb 100644 --- a/include/linux/kvm_dirty_ring.h +++ b/include/linux/kvm_dirty_ring.h @@ -37,6 +37,11 @@ static inline u32 kvm_dirty_ring_get_rsvd_entries(void) return 0; } +static inline bool kvm_use_dirty_bitmap(struct kvm *kvm) +{ + return true; +} + static inline int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size) { @@ -66,6 +71,7 @@ static inline void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) #else /* CONFIG_HAVE_KVM_DIRTY_RING */ int kvm_cpu_dirty_log_size(void); +bool kvm_use_dirty_bitmap(struct kvm *kvm); u32 kvm_dirty_ring_get_rsvd_entries(void); int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 648d663f32c4..db83f63f4e61 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -779,6 +779,7 @@ struct kvm { pid_t userspace_pid; unsigned int max_halt_poll_ns; u32 dirty_ring_size; + bool dirty_ring_with_bitmap; bool vm_bugged; bool vm_dead; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 0d5d4419139a..c87b5882d7ae 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1178,6 +1178,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_S390_ZPCI_OP 221 #define KVM_CAP_S390_CPU_TOPOLOGY 222 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223 +#define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 224 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index 800f9470e36b..228be1145cf3 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -33,6 +33,14 @@ config HAVE_KVM_DIRTY_RING_ACQ_REL bool select HAVE_KVM_DIRTY_RING +# Only architectures that need to dirty memory outside of a vCPU +# context should select this, advertising to userspace the +# requirement to use a dirty bitmap in addition to the vCPU dirty +# ring. +config HAVE_KVM_DIRTY_RING_WITH_BITMAP + bool + depends on HAVE_KVM_DIRTY_RING + config HAVE_KVM_EVENTFD bool select EVENTFD diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 6091e1403bc8..7ce6a5f81c98 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -21,6 +21,11 @@ u32 kvm_dirty_ring_get_rsvd_entries(void) return KVM_DIRTY_RING_RSVD_ENTRIES + kvm_cpu_dirty_log_size(); } +bool kvm_use_dirty_bitmap(struct kvm *kvm) +{ + return !kvm->dirty_ring_size || kvm->dirty_ring_with_bitmap; +} + static u32 kvm_dirty_ring_used(struct kvm_dirty_ring *ring) { return READ_ONCE(ring->dirty_index) - READ_ONCE(ring->reset_index); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 91cf51a25394..0351c8fb41b9 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1617,7 +1617,7 @@ static int kvm_prepare_memory_region(struct kvm *kvm, new->dirty_bitmap = NULL; else if (old && old->dirty_bitmap) new->dirty_bitmap = old->dirty_bitmap; - else if (!kvm->dirty_ring_size) { + else if (kvm_use_dirty_bitmap(kvm)) { r = kvm_alloc_dirty_bitmap(new); if (r) return r; @@ -2060,8 +2060,8 @@ int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log, unsigned long n; unsigned long any = 0; - /* Dirty ring tracking is exclusive to dirty log tracking */ - if (kvm->dirty_ring_size) + /* Dirty ring tracking may be exclusive to dirty log tracking */ + if (!kvm_use_dirty_bitmap(kvm)) return -ENXIO; *memslot = NULL; @@ -2125,8 +2125,8 @@ static int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log) unsigned long *dirty_bitmap_buffer; bool flush; - /* Dirty ring tracking is exclusive to dirty log tracking */ - if (kvm->dirty_ring_size) + /* Dirty ring tracking may be exclusive to dirty log tracking */ + if (!kvm_use_dirty_bitmap(kvm)) return -ENXIO; as_id = log->slot >> 16; @@ -2237,8 +2237,8 @@ static int kvm_clear_dirty_log_protect(struct kvm *kvm, unsigned long *dirty_bitmap_buffer; bool flush; - /* Dirty ring tracking is exclusive to dirty log tracking */ - if (kvm->dirty_ring_size) + /* Dirty ring tracking may be exclusive to dirty log tracking */ + if (!kvm_use_dirty_bitmap(kvm)) return -ENXIO; as_id = log->slot >> 16; @@ -3305,7 +3305,10 @@ void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_vcpu *vcpu = kvm_get_running_vcpu(); #ifdef CONFIG_HAVE_KVM_DIRTY_RING - if (WARN_ON_ONCE(!vcpu) || WARN_ON_ONCE(vcpu->kvm != kvm)) + if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm)) + return; + + if (WARN_ON_ONCE(!kvm->dirty_ring_with_bitmap && !vcpu)) return; #endif @@ -3313,7 +3316,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, unsigned long rel_gfn = gfn - memslot->base_gfn; u32 slot = (memslot->as_id << 16) | memslot->id; - if (kvm->dirty_ring_size) + if (kvm->dirty_ring_size && vcpu) kvm_dirty_ring_push(vcpu, slot, rel_gfn); else set_bit_le(rel_gfn, memslot->dirty_bitmap); @@ -4482,6 +4485,9 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) return KVM_DIRTY_RING_MAX_ENTRIES * sizeof(struct kvm_dirty_gfn); #else return 0; +#endif +#ifdef CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP + case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP: #endif case KVM_CAP_BINARY_STATS_FD: case KVM_CAP_SYSTEM_EVENT_DATA: @@ -4588,6 +4594,13 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, return -EINVAL; return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]); + case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP: + if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP) || + !kvm->dirty_ring_size) + return -EINVAL; + + kvm->dirty_ring_with_bitmap = true; + return 0; default: return kvm_vm_ioctl_enable_cap(kvm, cap); } From patchwork Mon Oct 31 00:36:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15085C38A02 for ; Mon, 31 Oct 2022 00:39:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229689AbiJaAjc (ORCPT ); Sun, 30 Oct 2022 20:39:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229682AbiJaAja (ORCPT ); Sun, 30 Oct 2022 20:39:30 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D13FD95BF for ; Sun, 30 Oct 2022 17:38:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176718; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=grdRprUQKNyQoVxwZKCiULgME4MZrUwACLwcip/PECs=; b=Bjk1ekTXbUtR6erZqAAX2i8d91VUqdc8O+gZoxQqg1AJM7/5H5BBwony27kXgSkfs9VsVo +yw9U9qGwWI9pqltH3v4qW3cJSDbZ3Iej4U9Miqn8I28eU29lwhVAp8GE/oe9KIe0CzJgO 32r1umX8GE2kc4ENDA5cDQ80vDG+ukE= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-466-LN1dxmwqPrKULuYAHtY2Gg-1; Sun, 30 Oct 2022 20:38:34 -0400 X-MC-Unique: LN1dxmwqPrKULuYAHtY2Gg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 07B29833AED; Mon, 31 Oct 2022 00:38:34 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 39BA140C6F9F; Mon, 31 Oct 2022 00:38:26 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 5/9] KVM: arm64: Improve no-running-vcpu report for dirty ring Date: Mon, 31 Oct 2022 08:36:17 +0800 Message-Id: <20221031003621.164306-6-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP should be enabled only when KVM device "kvm-arm-vgic-its" is used by userspace. Currently, it's the only case where a running VCPU is missed for dirty ring. However, there are potentially other devices introducing similar error in future. In order to report those broken devices only, the no-running-vcpu warning message is escaped from KVM device "kvm-arm-vgic-its". For this, the function vgic_has_its() needs to be exposed with a more generic function name (kvm_vgic_has_its()). Link: https://lore.kernel.org/kvmarm/Y1ghIKrAsRFwSFsO@google.com Suggested-by: Sean Christopherson Signed-off-by: Gavin Shan --- arch/arm64/kvm/mmu.c | 14 ++++++++++++++ arch/arm64/kvm/vgic/vgic-init.c | 4 ++-- arch/arm64/kvm/vgic/vgic-irqfd.c | 4 ++-- arch/arm64/kvm/vgic/vgic-its.c | 2 +- arch/arm64/kvm/vgic/vgic-mmio-v3.c | 18 ++++-------------- arch/arm64/kvm/vgic/vgic.c | 10 ++++++++++ arch/arm64/kvm/vgic/vgic.h | 1 - include/kvm/arm_vgic.h | 1 + include/linux/kvm_dirty_ring.h | 1 + virt/kvm/dirty_ring.c | 5 +++++ virt/kvm/kvm_main.c | 2 +- 11 files changed, 41 insertions(+), 21 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 60ee3d9f01f8..e0855b2b3d66 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -932,6 +932,20 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask); } +/* + * kvm_arch_allow_write_without_running_vcpu - allow writing guest memory + * without the running VCPU when dirty ring is enabled. + * + * The running VCPU is required to track dirty guest pages when dirty ring + * is enabled. Otherwise, the backup bitmap should be used to track the + * dirty guest pages. When vgic/its is enabled, we need to use the backup + * bitmap to track the dirty guest pages for it. + */ +bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm) +{ + return kvm->dirty_ring_with_bitmap && kvm_vgic_has_its(kvm); +} + static void kvm_send_hwpoison_signal(unsigned long address, short lsb) { send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current); diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c index f6d4f4052555..4c7f443c6d3d 100644 --- a/arch/arm64/kvm/vgic/vgic-init.c +++ b/arch/arm64/kvm/vgic/vgic-init.c @@ -296,7 +296,7 @@ int vgic_init(struct kvm *kvm) } } - if (vgic_has_its(kvm)) + if (kvm_vgic_has_its(kvm)) vgic_lpi_translation_cache_init(kvm); /* @@ -352,7 +352,7 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm) dist->vgic_cpu_base = VGIC_ADDR_UNDEF; } - if (vgic_has_its(kvm)) + if (kvm_vgic_has_its(kvm)) vgic_lpi_translation_cache_destroy(kvm); if (vgic_supports_direct_msis(kvm)) diff --git a/arch/arm64/kvm/vgic/vgic-irqfd.c b/arch/arm64/kvm/vgic/vgic-irqfd.c index 475059bacedf..e33cc34bf8f5 100644 --- a/arch/arm64/kvm/vgic/vgic-irqfd.c +++ b/arch/arm64/kvm/vgic/vgic-irqfd.c @@ -88,7 +88,7 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, { struct kvm_msi msi; - if (!vgic_has_its(kvm)) + if (!kvm_vgic_has_its(kvm)) return -ENODEV; if (!level) @@ -112,7 +112,7 @@ int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e, case KVM_IRQ_ROUTING_MSI: { struct kvm_msi msi; - if (!vgic_has_its(kvm)) + if (!kvm_vgic_has_its(kvm)) break; kvm_populate_msi(e, &msi); diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c index 733b53055f97..40622da7348a 100644 --- a/arch/arm64/kvm/vgic/vgic-its.c +++ b/arch/arm64/kvm/vgic/vgic-its.c @@ -698,7 +698,7 @@ struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi) struct kvm_io_device *kvm_io_dev; struct vgic_io_device *iodev; - if (!vgic_has_its(kvm)) + if (!kvm_vgic_has_its(kvm)) return ERR_PTR(-ENODEV); if (!(msi->flags & KVM_MSI_VALID_DEVID)) diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c index 91201f743033..10218057c176 100644 --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c @@ -38,20 +38,10 @@ u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len, return reg | ((u64)val << lower); } -bool vgic_has_its(struct kvm *kvm) -{ - struct vgic_dist *dist = &kvm->arch.vgic; - - if (dist->vgic_model != KVM_DEV_TYPE_ARM_VGIC_V3) - return false; - - return dist->has_its; -} - bool vgic_supports_direct_msis(struct kvm *kvm) { return (kvm_vgic_global_state.has_gicv4_1 || - (kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm))); + (kvm_vgic_global_state.has_gicv4 && kvm_vgic_has_its(kvm))); } /* @@ -78,7 +68,7 @@ static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu, case GICD_TYPER: value = vgic->nr_spis + VGIC_NR_PRIVATE_IRQS; value = (value >> 5) - 1; - if (vgic_has_its(vcpu->kvm)) { + if (kvm_vgic_has_its(vcpu->kvm)) { value |= (INTERRUPT_ID_BITS_ITS - 1) << 19; value |= GICD_TYPER_LPIS; } else { @@ -262,7 +252,7 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu, struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; u32 ctlr; - if (!vgic_has_its(vcpu->kvm)) + if (!kvm_vgic_has_its(vcpu->kvm)) return; if (!(val & GICR_CTLR_ENABLE_LPIS)) { @@ -326,7 +316,7 @@ static unsigned long vgic_mmio_read_v3r_typer(struct kvm_vcpu *vcpu, value = (u64)(mpidr & GENMASK(23, 0)) << 32; value |= ((target_vcpu_id & 0xffff) << 8); - if (vgic_has_its(vcpu->kvm)) + if (kvm_vgic_has_its(vcpu->kvm)) value |= GICR_TYPER_PLPIS; if (vgic_mmio_vcpu_rdist_is_last(vcpu)) diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c index d97e6080b421..9ef7488ed0c7 100644 --- a/arch/arm64/kvm/vgic/vgic.c +++ b/arch/arm64/kvm/vgic/vgic.c @@ -21,6 +21,16 @@ struct vgic_global kvm_vgic_global_state __ro_after_init = { .gicv3_cpuif = STATIC_KEY_FALSE_INIT, }; +bool kvm_vgic_has_its(struct kvm *kvm) +{ + struct vgic_dist *dist = &kvm->arch.vgic; + + if (dist->vgic_model != KVM_DEV_TYPE_ARM_VGIC_V3) + return false; + + return dist->has_its; +} + /* * Locking order is always: * kvm->lock (mutex) diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h index 0c8da72953f0..f91114ee1cd5 100644 --- a/arch/arm64/kvm/vgic/vgic.h +++ b/arch/arm64/kvm/vgic/vgic.h @@ -235,7 +235,6 @@ void vgic_v3_load(struct kvm_vcpu *vcpu); void vgic_v3_put(struct kvm_vcpu *vcpu); void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu); -bool vgic_has_its(struct kvm *kvm); int kvm_vgic_register_its_device(void); void vgic_enable_lpis(struct kvm_vcpu *vcpu); void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu); diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h index 4df9e73a8bb5..72e9bc6c66a4 100644 --- a/include/kvm/arm_vgic.h +++ b/include/kvm/arm_vgic.h @@ -374,6 +374,7 @@ int kvm_vgic_map_resources(struct kvm *kvm); int kvm_vgic_hyp_init(void); void kvm_vgic_init_cpu_hardware(void); +bool kvm_vgic_has_its(struct kvm *kvm); int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid, bool level, void *owner); int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq, diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h index b08b9afd8bdb..bb0a72401b5a 100644 --- a/include/linux/kvm_dirty_ring.h +++ b/include/linux/kvm_dirty_ring.h @@ -72,6 +72,7 @@ static inline void kvm_dirty_ring_free(struct kvm_dirty_ring *ring) int kvm_cpu_dirty_log_size(void); bool kvm_use_dirty_bitmap(struct kvm *kvm); +bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm); u32 kvm_dirty_ring_get_rsvd_entries(void); int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size); diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 7ce6a5f81c98..f27e038043f3 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -26,6 +26,11 @@ bool kvm_use_dirty_bitmap(struct kvm *kvm) return !kvm->dirty_ring_size || kvm->dirty_ring_with_bitmap; } +bool __weak kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm) +{ + return kvm->dirty_ring_with_bitmap; +} + static u32 kvm_dirty_ring_used(struct kvm_dirty_ring *ring) { return READ_ONCE(ring->dirty_index) - READ_ONCE(ring->reset_index); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 0351c8fb41b9..e1be4f89df3b 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3308,7 +3308,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, if (WARN_ON_ONCE(vcpu && vcpu->kvm != kvm)) return; - if (WARN_ON_ONCE(!kvm->dirty_ring_with_bitmap && !vcpu)) + if (WARN_ON_ONCE(!kvm_arch_allow_write_without_running_vcpu(kvm) && !vcpu)) return; #endif From patchwork Mon Oct 31 00:36:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC91CC38A02 for ; Mon, 31 Oct 2022 00:39:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229696AbiJaAjn (ORCPT ); Sun, 30 Oct 2022 20:39:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229681AbiJaAjj (ORCPT ); Sun, 30 Oct 2022 20:39:39 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E39F9FC5 for ; Sun, 30 Oct 2022 17:38:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176726; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qON2RR/+3+ei4igj53NRnSplYAE/BYjfNh8P/c25EoM=; b=bQXZ/O5wuuSfhORnaaQ3uZ56FBh+xWBkxHesUVL0SBKBsyYlfn47gOxj3rQyb1HOqSDI7l cN3gA94ejxkcDsWWj8aV9t+z432tQGjOsRlhxAAIQwC2/6WAyoBUHTDdQtm5FiC9Wm32W3 dRTYpyoqkHon2UbLqUhzpbJo2vngBuo= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-49-kYeDSg5vMEudhQkncI9KZA-1; Sun, 30 Oct 2022 20:38:42 -0400 X-MC-Unique: kYeDSg5vMEudhQkncI9KZA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 239CF101A528; Mon, 31 Oct 2022 00:38:41 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9B22F40C947B; Mon, 31 Oct 2022 00:38:34 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 6/9] KVM: arm64: Enable ring-based dirty memory tracking Date: Mon, 31 Oct 2022 08:36:18 +0800 Message-Id: <20221031003621.164306-7-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Enable ring-based dirty memory tracking on arm64 by selecting CONFIG_HAVE_KVM_DIRTY_{RING_ACQ_REL, RING_WITH_BITMAP} and providing the ring buffer's physical page offset (KVM_DIRTY_LOG_PAGE_OFFSET). Signed-off-by: Gavin Shan --- Documentation/virt/kvm/api.rst | 2 +- arch/arm64/include/uapi/asm/kvm.h | 1 + arch/arm64/kvm/Kconfig | 2 ++ arch/arm64/kvm/arm.c | 3 +++ 4 files changed, 7 insertions(+), 1 deletion(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 4d4eeb5c3c5a..06d72bca12c9 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7921,7 +7921,7 @@ regardless of what has actually been exposed through the CPUID leaf. 8.29 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL ---------------------------------------------------------- -:Architectures: x86 +:Architectures: x86, arm64 :Parameters: args[0] - size of the dirty log ring KVM is capable of tracking dirty memory using ring buffers that are diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h index 316917b98707..a7a857f1784d 100644 --- a/arch/arm64/include/uapi/asm/kvm.h +++ b/arch/arm64/include/uapi/asm/kvm.h @@ -43,6 +43,7 @@ #define __KVM_HAVE_VCPU_EVENTS #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 +#define KVM_DIRTY_LOG_PAGE_OFFSET 64 #define KVM_REG_SIZE(id) \ (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig index 815cc118c675..066b053e9eb9 100644 --- a/arch/arm64/kvm/Kconfig +++ b/arch/arm64/kvm/Kconfig @@ -32,6 +32,8 @@ menuconfig KVM select KVM_VFIO select HAVE_KVM_EVENTFD select HAVE_KVM_IRQFD + select HAVE_KVM_DIRTY_RING_ACQ_REL + select HAVE_KVM_DIRTY_RING_WITH_BITMAP select HAVE_KVM_MSI select HAVE_KVM_IRQCHIP select HAVE_KVM_IRQ_ROUTING diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 94d33e296e10..6b097605e38c 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -746,6 +746,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) return kvm_vcpu_suspend(vcpu); + + if (kvm_dirty_ring_check_request(vcpu)) + return 0; } return 1; From patchwork Mon Oct 31 00:36:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025316 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4339BFA373D for ; Mon, 31 Oct 2022 00:39:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229714AbiJaAjv (ORCPT ); Sun, 30 Oct 2022 20:39:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229682AbiJaAju (ORCPT ); Sun, 30 Oct 2022 20:39:50 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E678F9FCF for ; Sun, 30 Oct 2022 17:38:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176737; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AKNjwXr1+6lA46ix04voPDpp+Bnwo29abQuhxfN937w=; b=aRvTvSwgfbyXW53vnpTKNTohHRD5Ip8rv5hpiOdbl4hKng+/i0EjqGFLDigzF11ZdXH7tA FgDlUXNtZKyfyuwa3+eA+30wCFp/d3coikMBeTo8iOJhN3comyKNRPlawbxl+ry+9PQJeJ e3Q9Ngvtk3q/YKlJQM4oNWKwzg4FsDU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-610--_xpFcvYN8mqjWE8WR29_A-1; Sun, 30 Oct 2022 20:38:49 -0400 X-MC-Unique: -_xpFcvYN8mqjWE8WR29_A-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4D060800B30; Mon, 31 Oct 2022 00:38:48 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B930C40C6F9F; Mon, 31 Oct 2022 00:38:41 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 7/9] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Date: Mon, 31 Oct 2022 08:36:19 +0800 Message-Id: <20221031003621.164306-8-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In vcpu_map_dirty_ring(), the guest's page size is used to figure out the offset in the virtual area. It works fine when we have same page sizes on host and guest. However, it fails when the page sizes on host and guest are different on arm64, like below error messages indicates. # ./dirty_log_test -M dirty-ring -m 7 Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 64K pages guest physical test memory offset: 0xffbffc0000 vcpu stops because vcpu is kicked out... Notifying vcpu to continue vcpu continues now. ==== Test Assertion Failure ==== lib/kvm_util.c:1477: addr == MAP_FAILED pid=9000 tid=9000 errno=0 - Success 1 0x0000000000405f5b: vcpu_map_dirty_ring at kvm_util.c:1477 2 0x0000000000402ebb: dirty_ring_collect_dirty_pages at dirty_log_test.c:349 3 0x00000000004029b3: log_mode_collect_dirty_pages at dirty_log_test.c:478 4 (inlined by) run_test at dirty_log_test.c:778 5 (inlined by) run_test at dirty_log_test.c:691 6 0x0000000000403a57: for_each_guest_mode at guest_modes.c:105 7 0x0000000000401ccf: main at dirty_log_test.c:921 8 0x0000ffffb06ec79b: ?? ??:0 9 0x0000ffffb06ec86b: ?? ??:0 10 0x0000000000401def: _start at ??:? Dirty ring mapped private Fix the issue by using host's page size to map the ring buffer. Signed-off-by: Gavin Shan --- tools/testing/selftests/kvm/lib/kvm_util.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index f1cb1627161f..89a1a420ebd5 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1506,7 +1506,7 @@ struct kvm_reg_list *vcpu_get_reg_list(struct kvm_vcpu *vcpu) void *vcpu_map_dirty_ring(struct kvm_vcpu *vcpu) { - uint32_t page_size = vcpu->vm->page_size; + uint32_t page_size = getpagesize(); uint32_t size = vcpu->vm->dirty_ring_size; TEST_ASSERT(size > 0, "Should enable dirty ring first"); From patchwork Mon Oct 31 00:36:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61E26FA3743 for ; Mon, 31 Oct 2022 00:39:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229682AbiJaAjw (ORCPT ); Sun, 30 Oct 2022 20:39:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229711AbiJaAju (ORCPT ); Sun, 30 Oct 2022 20:39:50 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 844FC5FB4 for ; Sun, 30 Oct 2022 17:39:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176742; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cNQilMgvXDneE1h9HbtAPKvGFLnkNGwhukg91aqJDGY=; b=N4+3+DwwN5e2Eq5bIdIX8dwcQl11xef2pQMRn9Yc+3F+1oK/jpUR9moAMj7DxCibS9Jl5g JPRuK6Fqnui9dNjabJm5gg7la8rb0rUroXd7jlMKKKk21nZ0p2ExTTTcZpkvITMMk8ubUv kQWhrKS0Js79Wzet1m4OmaAXDKzQtv4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-592-HYLUENG8OZG3rJuR766aeA-1; Sun, 30 Oct 2022 20:38:56 -0400 X-MC-Unique: HYLUENG8OZG3rJuR766aeA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A7D58833AEE; Mon, 31 Oct 2022 00:38:55 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id DF4AB40C6F75; Mon, 31 Oct 2022 00:38:48 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 8/9] KVM: selftests: Clear dirty ring states between two modes in dirty_log_test Date: Mon, 31 Oct 2022 08:36:20 +0800 Message-Id: <20221031003621.164306-9-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org There are two states, which need to be cleared before next mode is executed. Otherwise, we will hit failure as the following messages indicate. - The variable 'dirty_ring_vcpu_ring_full' shared by main and vcpu thread. It's indicating if the vcpu exit due to full ring buffer. The value can be carried from previous mode (VM_MODE_P40V48_4K) to current one (VM_MODE_P40V48_64K) when VM_MODE_P40V48_16K isn't supported. - The current ring buffer index needs to be reset before next mode (VM_MODE_P40V48_64K) is executed. Otherwise, the stale value is carried from previous mode (VM_MODE_P40V48_4K). # ./dirty_log_test -M dirty-ring Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 4K pages guest physical test memory offset: 0xffbfffc000 : Dirtied 995328 pages Total bits checked: dirty (1012434), clear (7114123), track_next (966700) Testing guest mode: PA-bits:40, VA-bits:48, 64K pages guest physical test memory offset: 0xffbffc0000 vcpu stops because vcpu is kicked out... vcpu continues now. Notifying vcpu to continue Iteration 1 collected 0 pages vcpu stops because dirty ring is full... vcpu continues now. vcpu stops because dirty ring is full... vcpu continues now. vcpu stops because dirty ring is full... ==== Test Assertion Failure ==== dirty_log_test.c:369: cleared == count pid=10541 tid=10541 errno=22 - Invalid argument 1 0x0000000000403087: dirty_ring_collect_dirty_pages at dirty_log_test.c:369 2 0x0000000000402a0b: log_mode_collect_dirty_pages at dirty_log_test.c:492 3 (inlined by) run_test at dirty_log_test.c:795 4 (inlined by) run_test at dirty_log_test.c:705 5 0x0000000000403a37: for_each_guest_mode at guest_modes.c:100 6 0x0000000000401ccf: main at dirty_log_test.c:938 7 0x0000ffff9ecd279b: ?? ??:0 8 0x0000ffff9ecd286b: ?? ??:0 9 0x0000000000401def: _start at ??:? Reset dirty pages (0) mismatch with collected (35566) Fix the issues by clearing 'dirty_ring_vcpu_ring_full' and the ring buffer index before next new mode is to be executed. Signed-off-by: Gavin Shan --- tools/testing/selftests/kvm/dirty_log_test.c | 27 ++++++++++++-------- 1 file changed, 17 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index b5234d6efbe1..8758c10ec850 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -226,13 +226,15 @@ static void clear_log_create_vm_done(struct kvm_vm *vm) } static void dirty_log_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot, - void *bitmap, uint32_t num_pages) + void *bitmap, uint32_t num_pages, + uint32_t *unused) { kvm_vm_get_dirty_log(vcpu->vm, slot, bitmap); } static void clear_log_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot, - void *bitmap, uint32_t num_pages) + void *bitmap, uint32_t num_pages, + uint32_t *unused) { kvm_vm_get_dirty_log(vcpu->vm, slot, bitmap); kvm_vm_clear_dirty_log(vcpu->vm, slot, bitmap, 0, num_pages); @@ -329,10 +331,9 @@ static void dirty_ring_continue_vcpu(void) } static void dirty_ring_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot, - void *bitmap, uint32_t num_pages) + void *bitmap, uint32_t num_pages, + uint32_t *ring_buf_idx) { - /* We only have one vcpu */ - static uint32_t fetch_index = 0; uint32_t count = 0, cleared; bool continued_vcpu = false; @@ -349,7 +350,8 @@ static void dirty_ring_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot, /* Only have one vcpu */ count = dirty_ring_collect_one(vcpu_map_dirty_ring(vcpu), - slot, bitmap, num_pages, &fetch_index); + slot, bitmap, num_pages, + ring_buf_idx); cleared = kvm_vm_reset_dirty_ring(vcpu->vm); @@ -406,7 +408,8 @@ struct log_mode { void (*create_vm_done)(struct kvm_vm *vm); /* Hook to collect the dirty pages into the bitmap provided */ void (*collect_dirty_pages) (struct kvm_vcpu *vcpu, int slot, - void *bitmap, uint32_t num_pages); + void *bitmap, uint32_t num_pages, + uint32_t *ring_buf_idx); /* Hook to call when after each vcpu run */ void (*after_vcpu_run)(struct kvm_vcpu *vcpu, int ret, int err); void (*before_vcpu_join) (void); @@ -471,13 +474,14 @@ static void log_mode_create_vm_done(struct kvm_vm *vm) } static void log_mode_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot, - void *bitmap, uint32_t num_pages) + void *bitmap, uint32_t num_pages, + uint32_t *ring_buf_idx) { struct log_mode *mode = &log_modes[host_log_mode]; TEST_ASSERT(mode->collect_dirty_pages != NULL, "collect_dirty_pages() is required for any log mode!"); - mode->collect_dirty_pages(vcpu, slot, bitmap, num_pages); + mode->collect_dirty_pages(vcpu, slot, bitmap, num_pages, ring_buf_idx); } static void log_mode_after_vcpu_run(struct kvm_vcpu *vcpu, int ret, int err) @@ -696,6 +700,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) struct kvm_vcpu *vcpu; struct kvm_vm *vm; unsigned long *bmap; + uint32_t ring_buf_idx = 0; if (!log_mode_supported()) { print_skip("Log mode '%s' not supported", @@ -771,6 +776,7 @@ static void run_test(enum vm_guest_mode mode, void *arg) host_dirty_count = 0; host_clear_count = 0; host_track_next_count = 0; + WRITE_ONCE(dirty_ring_vcpu_ring_full, false); pthread_create(&vcpu_thread, NULL, vcpu_worker, vcpu); @@ -778,7 +784,8 @@ static void run_test(enum vm_guest_mode mode, void *arg) /* Give the vcpu thread some time to dirty some pages */ usleep(p->interval * 1000); log_mode_collect_dirty_pages(vcpu, TEST_MEM_SLOT_INDEX, - bmap, host_num_pages); + bmap, host_num_pages, + &ring_buf_idx); /* * See vcpu_sync_stop_requested definition for details on why From patchwork Mon Oct 31 00:36:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Shan X-Patchwork-Id: 13025318 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33365FA373D for ; Mon, 31 Oct 2022 00:40:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229718AbiJaAkK (ORCPT ); Sun, 30 Oct 2022 20:40:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229711AbiJaAkI (ORCPT ); Sun, 30 Oct 2022 20:40:08 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 775589FD8 for ; Sun, 30 Oct 2022 17:39:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667176748; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UeYV9gtrEMC/f2NfGupIKl3UhHTgEqpO1AGhP6jS8dg=; b=JYfu8clHAA+YOyFJWMUdB4Qm2SQz+ZTeJ3t4O46sPPRjTJvEHHgJp93BGdo4VCCMMUKudH 0xiVsd4ZCpon4P9CCaYth0TqejZn/3z1V/dWgezEbNQHETKUx3JYyv7YeyWHFTsNBYEWVH A3CGHhIFc7nRXCYvTV9URHWoXVl8zE0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-249-Ul4LwwRhM0aMhypjoAEcLQ-1; Sun, 30 Oct 2022 20:39:04 -0400 X-MC-Unique: Ul4LwwRhM0aMhypjoAEcLQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 52797185A794; Mon, 31 Oct 2022 00:39:03 +0000 (UTC) Received: from gshan.redhat.com (vpn2-54-151.bne.redhat.com [10.64.54.151]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 46E6D40C6F75; Mon, 31 Oct 2022 00:38:55 +0000 (UTC) From: Gavin Shan To: kvmarm@lists.linux.dev Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, andrew.jones@linux.dev, ajones@ventanamicro.com, maz@kernel.org, bgardon@google.com, catalin.marinas@arm.com, dmatlack@google.com, will@kernel.org, pbonzini@redhat.com, peterx@redhat.com, oliver.upton@linux.dev, seanjc@google.com, james.morse@arm.com, shuah@kernel.org, suzuki.poulose@arm.com, alexandru.elisei@arm.com, zhenyzha@redhat.com, shan.gavin@gmail.com Subject: [PATCH v7 9/9] KVM: selftests: Automate choosing dirty ring size in dirty_log_test Date: Mon, 31 Oct 2022 08:36:21 +0800 Message-Id: <20221031003621.164306-10-gshan@redhat.com> In-Reply-To: <20221031003621.164306-1-gshan@redhat.com> References: <20221031003621.164306-1-gshan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In the dirty ring case, we rely on vcpu exit due to full dirty ring state. On ARM64 system, there are 4096 host pages when the host page size is 64KB. In this case, the vcpu never exits due to the full dirty ring state. The similar case is 4KB page size on host and 64KB page size on guest. The vcpu corrupts same set of host pages, but the dirty page information isn't collected in the main thread. This leads to infinite loop as the following log shows. # ./dirty_log_test -M dirty-ring -c 65536 -m 5 Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 4K pages guest physical test memory offset: 0xffbffe0000 vcpu stops because vcpu is kicked out... Notifying vcpu to continue vcpu continues now. Iteration 1 collected 576 pages Fix the issue by automatically choosing the best dirty ring size, to ensure vcpu exit due to full dirty ring state. The option '-c' becomes a hint to the dirty ring count, instead of the value of it. Signed-off-by: Gavin Shan --- tools/testing/selftests/kvm/dirty_log_test.c | 26 +++++++++++++++++--- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c index 8758c10ec850..a87e5f78ebf1 100644 --- a/tools/testing/selftests/kvm/dirty_log_test.c +++ b/tools/testing/selftests/kvm/dirty_log_test.c @@ -24,6 +24,9 @@ #include "guest_modes.h" #include "processor.h" +#define DIRTY_MEM_BITS 30 /* 1G */ +#define PAGE_SHIFT_4K 12 + /* The memory slot index to track dirty pages */ #define TEST_MEM_SLOT_INDEX 1 @@ -273,6 +276,24 @@ static bool dirty_ring_supported(void) static void dirty_ring_create_vm_done(struct kvm_vm *vm) { + uint64_t pages; + uint32_t limit; + + /* + * We rely on vcpu exit due to full dirty ring state. Adjust + * the ring buffer size to ensure we're able to reach the + * full dirty ring state. + */ + pages = (1ul << (DIRTY_MEM_BITS - vm->page_shift)) + 3; + pages = vm_adjust_num_guest_pages(vm->mode, pages); + if (vm->page_size < getpagesize()) + pages = vm_num_host_pages(vm->mode, pages); + + limit = 1 << (31 - __builtin_clz(pages)); + test_dirty_ring_count = 1 << (31 - __builtin_clz(test_dirty_ring_count)); + test_dirty_ring_count = min(limit, test_dirty_ring_count); + pr_info("dirty ring count: 0x%x\n", test_dirty_ring_count); + /* * Switch to dirty ring mode after VM creation but before any * of the vcpu creation. @@ -685,9 +706,6 @@ static struct kvm_vm *create_vm(enum vm_guest_mode mode, struct kvm_vcpu **vcpu, return vm; } -#define DIRTY_MEM_BITS 30 /* 1G */ -#define PAGE_SHIFT_4K 12 - struct test_params { unsigned long iterations; unsigned long interval; @@ -830,7 +848,7 @@ static void help(char *name) printf("usage: %s [-h] [-i iterations] [-I interval] " "[-p offset] [-m mode]\n", name); puts(""); - printf(" -c: specify dirty ring size, in number of entries\n"); + printf(" -c: hint to dirty ring size, in number of entries\n"); printf(" (only useful for dirty-ring test; default: %"PRIu32")\n", TEST_DIRTY_RING_COUNT); printf(" -i: specify iteration counts (default: %"PRIu64")\n",