From patchwork Wed Jan 4 20:44:24 2017
X-Patchwork-Submitter: "Cao, Lei"
X-Patchwork-Id: 9497487
From: "Cao, Lei"
To: Paolo Bonzini, Radim Krčmář, "kvm@vger.kernel.org"
Subject: [PATCH v2 3/4] KVM: Dirty memory tracking for performant checkpointing solutions
Date: Wed, 4 Jan 2017 20:44:24 +0000
References: <201701041907.v04J7aVq010780@dev1.sn.stratus.com>
X-Mailing-List: kvm@vger.kernel.org

Force vcpus to exit when the dirty list is full.

Signed-off-by: Lei Cao
---
 arch/x86/include/asm/kvm_host.h |  7 +++++++
 arch/x86/kvm/mmu.c              |  7 +++++++
 arch/x86/kvm/vmx.c              |  7 +++++++
 arch/x86/kvm/x86.c              | 10 ++++++++++
 include/linux/kvm_host.h        |  1 +
 include/uapi/linux/kvm.h        |  1 +
 virt/kvm/kvm_main.c             | 36 ++++++++++++++++++++++++++++++++++++
 7 files changed, 69 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6dfb14a..20a9fc8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -75,6 +75,7 @@
 #define KVM_REQ_HV_RESET 28
 #define KVM_REQ_HV_EXIT 29
 #define KVM_REQ_HV_STIMER 30
+#define KVM_REQ_EXIT_DIRTY_LOG_FULL 31
 
 #define CR0_RESERVED_BITS \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -997,6 +998,8 @@ struct kvm_x86_ops {
 	 *  - enable_log_dirty_pt_masked:
 	 *	called when reenabling log dirty for the GFNs in the mask after
 	 *	corresponding bits are cleared in slot->dirty_bitmap.
+	 *  - cpu_dirty_log_size:
+	 *	called to inquire about the size of the hardware dirty log
 	 */
 	void (*slot_enable_log_dirty)(struct kvm *kvm,
 				      struct kvm_memory_slot *slot);
@@ -1006,6 +1009,8 @@ struct kvm_x86_ops {
 	void (*enable_log_dirty_pt_masked)(struct kvm *kvm,
 					   struct kvm_memory_slot *slot,
 					   gfn_t offset, unsigned long mask);
+	int (*cpu_dirty_log_size)(void);
+
 	/* pmu operations of sub-arch */
 	const struct kvm_pmu_ops *pmu_ops;
 
@@ -1388,6 +1393,8 @@ bool kvm_intr_is_single_vcpu(struct kvm *kvm, struct kvm_lapic_irq *irq,
 void kvm_set_msi_irq(struct kvm *kvm, struct kvm_kernel_irq_routing_entry *e,
 		     struct kvm_lapic_irq *irq);
 
+int kvm_mt_cpu_dirty_log_size(void);
+
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
 	if (kvm_x86_ops->vcpu_blocking)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 7012de4..e0668a0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4980,6 +4980,13 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, struct kvm_memslots *slots)
 	}
 }
 
+int kvm_mt_cpu_dirty_log_size(void)
+{
+	if (kvm_x86_ops->cpu_dirty_log_size)
+		return kvm_x86_ops->cpu_dirty_log_size();
+	return 0;
+}
+
 static unsigned long
 mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ba20b00..76f88b0 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6729,6 +6729,7 @@ static __init int hardware_setup(void)
 		kvm_x86_ops->slot_disable_log_dirty = NULL;
 		kvm_x86_ops->flush_log_dirty = NULL;
 		kvm_x86_ops->enable_log_dirty_pt_masked = NULL;
+		kvm_x86_ops->cpu_dirty_log_size = NULL;
 	}
 
 	if (cpu_has_vmx_preemption_timer() && enable_preemption_timer) {
@@ -11503,6 +11504,11 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 			~FEATURE_CONTROL_LMCE;
 }
 
+static int vmx_cpu_dirty_log_size(void)
+{
+	return PML_ENTITY_NUM;
+}
+
 static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -11617,6 +11623,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.slot_disable_log_dirty = vmx_slot_disable_log_dirty,
 	.flush_log_dirty = vmx_flush_log_dirty,
 	.enable_log_dirty_pt_masked = vmx_enable_log_dirty_pt_masked,
+	.cpu_dirty_log_size = vmx_cpu_dirty_log_size,
 
 	.pre_block = vmx_pre_block,
 	.post_block = vmx_post_block,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5707129..e2f4cee 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6714,6 +6714,16 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 */
 		if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
 			kvm_hv_process_stimers(vcpu);
+		if (kvm_check_request(KVM_REQ_EXIT_DIRTY_LOG_FULL, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_DIRTY_LOG_FULL;
+			r = -EINTR;
+			if (vcpu->need_exit) {
+				vcpu->need_exit = false;
+				kvm_make_all_cpus_request(vcpu->kvm,
+					KVM_REQ_EXIT_DIRTY_LOG_FULL);
+			}
+			goto out;
+		}
 	}
 
 	/*
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7a85b30..b7fedeb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -283,6 +283,7 @@ struct kvm_vcpu {
 	struct dentry *debugfs_dentry;
 #ifdef KVM_DIRTY_LOG_PAGE_OFFSET
 	struct gfn_list_t *dirty_logs;
+	bool need_exit;
 #endif
 };
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 05332de..bacb8db 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -205,6 +205,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI 25
 #define KVM_EXIT_IOAPIC_EOI 26
 #define KVM_EXIT_HYPERV 27
+#define KVM_EXIT_DIRTY_LOG_FULL 28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bff980c..00d7989 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -270,6 +270,7 @@ int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 		}
 		vcpu->dirty_logs = page_address(page);
 	}
+	vcpu->need_exit = false;
 #endif
 
 	kvm_vcpu_set_in_spin_loop(vcpu, false);
@@ -3030,6 +3031,29 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 }
 
 #ifdef KVM_DIRTY_LOG_PAGE_OFFSET
+static void kvm_mt_dirty_log_full(struct kvm *kvm, struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Request vcpu exits, but if interrupts are disabled, we have
+	 * to defer the requests because smp_call_xxx may deadlock when
+	 * called that way.
+	 */
+	if (vcpu && irqs_disabled()) {
+		kvm_make_request(KVM_REQ_EXIT_DIRTY_LOG_FULL, vcpu);
+		vcpu->need_exit = true;
+	} else {
+		WARN_ON(irqs_disabled());
+		kvm_make_all_cpus_request(kvm,
+			KVM_REQ_EXIT_DIRTY_LOG_FULL);
+	}
+}
+
+/*
+ * estimated number of pages being dirtied during vcpu exit, not counting
+ * hardware dirty log (PML) flush
+ */
+#define KVM_MT_DIRTY_PAGE_NUM_EXTRA 128
+
 void kvm_mt_mark_page_dirty(struct kvm *kvm, struct kvm_memory_slot *slot,
 	struct kvm_vcpu *vcpu, gfn_t gfn)
 {
@@ -3037,6 +3061,7 @@ void kvm_mt_mark_page_dirty(struct kvm *kvm, struct kvm_memory_slot *slot,
 	int slot_id;
 	u32 as_id = 0;
 	u64 offset;
+	u32 extra = KVM_MT_DIRTY_PAGE_NUM_EXTRA;
 
 	if (!slot || !slot->dirty_bitmap || !kvm->dirty_log_size)
 		return;
@@ -3068,6 +3093,17 @@ void kvm_mt_mark_page_dirty(struct kvm *kvm, struct kvm_memory_slot *slot,
 	gfnlist->dirty_gfns[gfnlist->dirty_index].offset = offset;
 	smp_wmb();
 	gfnlist->dirty_index++;
+
+	/*
+	 * more pages will be dirtied during vcpu exit, e.g. pml log
+	 * being flushed. So allow some buffer space.
+	 */
+	if (vcpu)
+		extra += kvm_mt_cpu_dirty_log_size();
+
+	if (gfnlist->dirty_index == (kvm->max_dirty_logs - extra))
+		kvm_mt_dirty_log_full(kvm, vcpu);
+
 	if (!vcpu)
 		spin_unlock(&kvm->dirty_log_lock);
 }
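
The threshold check in the last hunk of kvm_mt_mark_page_dirty() can be modeled in user space. The following is only a rough sketch of that arithmetic, not kernel code: every sketch_* name is hypothetical, SKETCH_EXTRA_PAGES mirrors KVM_MT_DIRTY_PAGE_NUM_EXTRA from the patch, and SKETCH_HW_LOG_SIZE assumes a 512-entry PML buffer, which the patch itself does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the constants used by the patch. */
#define SKETCH_EXTRA_PAGES 128u /* mirrors KVM_MT_DIRTY_PAGE_NUM_EXTRA */
#define SKETCH_HW_LOG_SIZE 512u /* assumed value of PML_ENTITY_NUM */

/* Minimal model of a dirty list: a fill index and a fixed capacity. */
struct sketch_dirty_list {
	uint32_t dirty_index;    /* next free slot */
	uint32_t max_dirty_logs; /* total number of slots */
};

/*
 * Headroom kept free below capacity. The hardware log size is added
 * only in vcpu context, as in the patch's "if (vcpu)" branch.
 */
static uint32_t sketch_headroom(bool in_vcpu_context)
{
	uint32_t extra = SKETCH_EXTRA_PAGES;

	if (in_vcpu_context)
		extra += SKETCH_HW_LOG_SIZE;
	return extra;
}

/*
 * Record one dirty page. Returns true when this append lands exactly
 * on the threshold, i.e. when the patch would call
 * kvm_mt_dirty_log_full().
 */
static bool sketch_log_dirty(struct sketch_dirty_list *list,
			     bool in_vcpu_context)
{
	uint32_t extra = sketch_headroom(in_vcpu_context);

	/* The subtraction below must not wrap around. */
	assert(list->max_dirty_logs > extra);
	assert(list->dirty_index < list->max_dirty_logs);

	list->dirty_index++;
	return list->dirty_index == list->max_dirty_logs - extra;
}
```

Under these assumptions, a list with 1024 slots filled from vcpu context signals at index 384 (1024 - 128 - 512), and from non-vcpu context at index 896. Because the comparison is an equality, the sketch signals once per fill rather than on every later append.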