From patchwork Tue Feb 25 19:17:55 2020
X-Patchwork-Submitter: Tamas K Lengyel
X-Patchwork-Id: 11404471
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Date: Tue, 25 Feb 2020 11:17:55 -0800
Message-Id: <8df741964b56c10ed912f9187dcb31aae7251085.1582658216.git.tamas.lengyel@intel.com>
Subject: [Xen-devel] [PATCH v10 1/3] xen/mem_sharing: VM forking
Cc: Stefano Stabellini, Tamas K Lengyel, Wei Liu, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Jackson, George Dunlap, Jan Beulich, Julien Grall, Roger Pau Monné

VM forking is the process of creating a domain with an empty memory space and a
parent domain specified from which to populate the memory when necessary. For
the new domain to be functional the VM state is copied over as part of the fork
operation (HVM params, hap allocation, etc).

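As a rough illustration of "populate the memory when necessary", the following user-space model resolves a fault in a fork either by sharing the ancestor's page (read) or by taking a private copy (write), which is the split that mem_sharing_fork_page() makes. It is a sketch only, not hypervisor code: toy_domain, toy_fault, toy_lookup_parent and the tiny fixed-size page table are hypothetical names invented here, and locking, p2m types and reference counting are deliberately left out.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* tiny pages keep the model readable */
#define TOY_NR_PAGES  4

enum toy_state { TOY_HOLE, TOY_SHARED, TOY_PRIVATE };

/* Hypothetical stand-in for a domain: a parent link plus a flat page table. */
struct toy_domain {
    struct toy_domain *parent;           /* NULL for a non-forked domain */
    enum toy_state state[TOY_NR_PAGES];
    unsigned char *page[TOY_NR_PAGES];   /* backing memory, NULL for a hole */
};

/* Walk up the fork chain and return the nearest ancestor-owned page. */
static unsigned char *toy_lookup_parent(const struct toy_domain *d,
                                        unsigned int gfn)
{
    for ( const struct toy_domain *p = d->parent; p; p = p->parent )
        if ( p->state[gfn] == TOY_PRIVATE )
            return p->page[gfn];

    return NULL;
}

/*
 * Resolve a fault on gfn. A read maps the ancestor's page as shared, a write
 * gives the fork its own copy. Returns 0 on success and -1 if no ancestor
 * backs the page or the allocation fails.
 */
static int toy_fault(struct toy_domain *d, unsigned int gfn, bool write)
{
    unsigned char *src, *copy;

    if ( gfn >= TOY_NR_PAGES )
        return -1;

    /* Nothing to do if the existing entry already satisfies the access. */
    if ( d->state[gfn] == TOY_PRIVATE ||
         (d->state[gfn] == TOY_SHARED && !write) )
        return 0;

    src = d->state[gfn] == TOY_SHARED ? d->page[gfn]
                                      : toy_lookup_parent(d, gfn);
    if ( !src )
        return -1;

    if ( !write )
    {
        d->page[gfn] = src;              /* share, no copy made */
        d->state[gfn] = TOY_SHARED;
        return 0;
    }

    if ( !(copy = malloc(TOY_PAGE_SIZE)) )
        return -1;

    memcpy(copy, src, TOY_PAGE_SIZE);    /* private copy for the writer */
    d->page[gfn] = copy;
    d->state[gfn] = TOY_PRIVATE;
    return 0;
}
```

In this model a fork that only reads never allocates memory of its own, and the parent's data stays untouched after a write fault in the fork.
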
Signed-off-by: Tamas K Lengyel --- v10: setup vcpu_info pages for vCPUs in the fork if the parent has them setup pages for special HVM PFNs if the parent has them minor adjustments based on Roger's comments --- xen/arch/x86/domain.c | 11 ++ xen/arch/x86/hvm/hvm.c | 4 +- xen/arch/x86/mm/hap/hap.c | 3 +- xen/arch/x86/mm/mem_sharing.c | 287 ++++++++++++++++++++++++++++++ xen/arch/x86/mm/p2m.c | 9 +- xen/common/domain.c | 3 + xen/include/asm-x86/hap.h | 1 + xen/include/asm-x86/hvm/hvm.h | 2 + xen/include/asm-x86/mem_sharing.h | 17 ++ xen/include/public/memory.h | 5 + xen/include/xen/sched.h | 5 + 11 files changed, 342 insertions(+), 5 deletions(-) diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index fe63c23676..1ab0ca0942 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2203,6 +2203,17 @@ int domain_relinquish_resources(struct domain *d) ret = relinquish_shared_pages(d); if ( ret ) return ret; + + /* + * If the domain is forked, decrement the parent's pause count + * and release the domain. + */ + if ( d->parent ) + { + domain_unpause(d->parent); + put_domain(d->parent); + d->parent = NULL; + } } #endif diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index a339b36a0d..c284f3cf5f 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1915,7 +1915,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla, } #endif - /* Spurious fault? PoD and log-dirty also take this path. */ + /* Spurious fault? PoD, log-dirty and VM forking also take this path. 
*/
     if ( p2m_is_ram(p2mt) )
     {
         rc = 1;
@@ -4429,7 +4429,7 @@ static int hvm_allow_get_param(struct domain *d,
     return rc;
 }

-static int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value)
+int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value)
 {
     int rc;

diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3d93f3451c..c7c7ff6e99 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -321,8 +321,7 @@ static void hap_free_p2m_page(struct domain *d, struct page_info *pg)
 }

 /* Return the size of the pool, rounded up to the nearest MB */
-static unsigned int
-hap_get_allocation(struct domain *d)
+unsigned int hap_get_allocation(struct domain *d)
 {
     unsigned int pg = d->arch.paging.hap.total_pages
         + d->arch.paging.hap.p2m_pages;
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 3835bc928f..8ee37e6943 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -22,6 +22,7 @@

 #include
 #include
+#include
 #include
 #include
 #include
@@ -36,6 +37,8 @@
 #include
 #include
 #include
+#include
+#include
 #include

 #include "mm-locks.h"
@@ -1444,6 +1447,263 @@ static inline int mem_sharing_control(struct domain *d, bool enable)
     return 0;
 }

+/*
+ * Forking a page only gets called when the VM faults due to no entry being
+ * in the EPT for the access. Depending on the type of access we either
+ * populate the physmap with a shared entry for read-only access or
+ * fork the page if it's a write access.
+ *
+ * The client p2m is already locked so we only need to lock
+ * the parent's here.
+ */ +int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool unsharing) +{ + int rc = -ENOENT; + shr_handle_t handle; + struct domain *parent = d->parent; + struct p2m_domain *p2m; + unsigned long gfn_l = gfn_x(gfn); + mfn_t mfn, new_mfn; + p2m_type_t p2mt; + struct page_info *page; + + if ( !mem_sharing_is_fork(d) ) + return -ENOENT; + + if ( !unsharing ) + { + /* For read-only accesses we just add a shared entry to the physmap */ + while ( parent ) + { + if ( !(rc = nominate_page(parent, gfn, 0, &handle)) ) + break; + + parent = parent->parent; + } + + if ( !rc ) + { + /* The client's p2m is already locked */ + struct p2m_domain *pp2m = p2m_get_hostp2m(parent); + + p2m_lock(pp2m); + rc = add_to_physmap(parent, gfn_l, handle, d, gfn_l, false); + p2m_unlock(pp2m); + + if ( !rc ) + return 0; + } + } + + /* + * If it's a write access (ie. unsharing) or if adding a shared entry to + * the physmap failed we'll fork the page directly. + */ + p2m = p2m_get_hostp2m(d); + parent = d->parent; + + while ( parent ) + { + mfn = get_gfn_query(parent, gfn_l, &p2mt); + + /* + * We can't fork grant memory from the parent, only regular ram. 
+ */ + if ( mfn_valid(mfn) && p2m_is_ram(p2mt) ) + break; + + put_gfn(parent, gfn_l); + parent = parent->parent; + } + + if ( !parent ) + return -ENOENT; + + if ( !(page = alloc_domheap_page(d, 0)) ) + { + put_gfn(parent, gfn_l); + return -ENOMEM; + } + + new_mfn = page_to_mfn(page); + copy_domain_page(new_mfn, mfn); + set_gpfn_from_mfn(mfn_x(new_mfn), gfn_l); + + put_gfn(parent, gfn_l); + + return p2m->set_entry(p2m, gfn, new_mfn, PAGE_ORDER_4K, p2m_ram_rw, + p2m->default_access, -1); +} + +static int bring_up_vcpus(struct domain *cd, struct domain *d) +{ + unsigned int i; + struct p2m_domain *p2m = p2m_get_hostp2m(cd); + int ret = -EINVAL; + + if ( d->max_vcpus != cd->max_vcpus ) + return ret; + + if ( (ret = cpupool_move_domain(cd, d->cpupool)) ) + return ret; + + for ( i = 0; i < cd->max_vcpus; i++ ) + { + mfn_t vcpu_info_mfn; + + if ( !d->vcpu[i] || cd->vcpu[i] ) + continue; + + if ( !vcpu_create(cd, i) ) + return -EINVAL; + + /* + * Map in a page for the vcpu_info if the guest uses one to the exact + * same spot. 
+         */
+        vcpu_info_mfn = d->vcpu[i]->vcpu_info_mfn;
+        if ( !mfn_eq(vcpu_info_mfn, INVALID_MFN) )
+        {
+            struct page_info *page;
+            mfn_t new_mfn;
+            gfn_t gfn = mfn_to_gfn(d, vcpu_info_mfn);
+            unsigned long gfn_l = gfn_x(gfn);
+
+            if ( !(page = alloc_domheap_page(cd, 0)) )
+                return -ENOMEM;
+
+            new_mfn = page_to_mfn(page);
+            set_gpfn_from_mfn(mfn_x(new_mfn), gfn_l);
+
+            if ( (ret = p2m->set_entry(p2m, gfn, new_mfn, PAGE_ORDER_4K,
+                                       p2m_ram_rw, p2m->default_access, -1)) )
+                return ret;
+
+            if ( (ret = map_vcpu_info(cd->vcpu[i], gfn_l,
+                                      d->vcpu[i]->vcpu_info_offset)) )
+                return ret;
+        }
+    }
+
+    domain_update_node_affinity(cd);
+    return 0;
+}
+
+static int fork_hap_allocation(struct domain *cd, struct domain *d)
+{
+    int rc;
+    bool preempted;
+    unsigned long mb = hap_get_allocation(d);
+
+    if ( mb == hap_get_allocation(cd) )
+        return 0;
+
+    paging_lock(cd);
+    rc = hap_set_allocation(cd, mb << (20 - PAGE_SHIFT), &preempted);
+    paging_unlock(cd);
+
+    return preempted ? -ERESTART : rc;
+}
+
+static void fork_tsc(struct domain *cd, struct domain *d)
+{
+    uint32_t tsc_mode;
+    uint32_t gtsc_khz;
+    uint32_t incarnation;
+    uint64_t elapsed_nsec;
+
+    tsc_get_info(d, &tsc_mode, &elapsed_nsec, &gtsc_khz, &incarnation);
+    /* Don't bump incarnation on set */
+    tsc_set_info(cd, tsc_mode, elapsed_nsec, gtsc_khz, incarnation - 1);
+}
+
+static int populate_special_pages(struct domain *cd)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(cd);
+    static const unsigned int params[] =
+    {
+        HVM_PARAM_STORE_PFN,
+        HVM_PARAM_IOREQ_PFN,
+        HVM_PARAM_BUFIOREQ_PFN,
+        HVM_PARAM_CONSOLE_PFN
+    };
+    unsigned int i;
+
+    for ( i=0; i<4; i++ )
+    {
+        uint64_t value = 0;
+        mfn_t new_mfn;
+        struct page_info *page;
+
+        if ( hvm_get_param(cd, params[i], &value) || !value )
+            continue;
+
+        if ( !(page = alloc_domheap_page(cd, 0)) )
+            return -ENOMEM;
+
+        new_mfn = page_to_mfn(page);
+        set_gpfn_from_mfn(mfn_x(new_mfn), value);
+
+        return p2m->set_entry(p2m, _gfn(value), new_mfn, PAGE_ORDER_4K,
+                              p2m_ram_rw,
p2m->default_access, -1); + } + + return 0; +} + +static int fork(struct domain *d, struct domain *cd) +{ + int rc = -EBUSY; + + if ( !cd->controller_pause_count ) + return rc; + + /* + * We only want to get and pause the parent once, not each time this + * operation is restarted due to preemption. + */ + if ( !cd->parent_paused ) + { + if ( !get_domain(d) ) + { + ASSERT_UNREACHABLE(); + return -EBUSY; + } + + domain_pause(d); + cd->parent_paused = true; + cd->max_pages = d->max_pages; + cd->max_vcpus = d->max_vcpus; + } + + /* this is preemptible so it's the first to get done */ + if ( (rc = fork_hap_allocation(cd, d)) ) + goto done; + + if ( (rc = bring_up_vcpus(cd, d)) ) + goto done; + + if ( (rc = hvm_copy_context_and_params(cd, d)) ) + goto done; + + if ( (rc = populate_special_pages(cd)) ) + goto done; + + fork_tsc(cd, d); + + cd->parent = d; + + done: + if ( rc && rc != -ERESTART ) + { + domain_unpause(d); + put_domain(d); + cd->parent_paused = false; + } + + return rc; +} + int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) { int rc; @@ -1698,6 +1958,33 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) rc = debug_gref(d, mso.u.debug.u.gref); break; + case XENMEM_sharing_op_fork: + { + struct domain *pd; + + rc = -EINVAL; + if ( mso.u.fork._pad[0] || mso.u.fork._pad[1] || + mso.u.fork._pad[2] ) + goto out; + + rc = rcu_lock_live_remote_domain_by_id(mso.u.fork.parent_domain, + &pd); + if ( rc ) + goto out; + + if ( !mem_sharing_enabled(pd) && (rc = mem_sharing_control(pd, true)) ) + goto out; + + rc = fork(pd, d); + + if ( rc == -ERESTART ) + rc = hypercall_create_continuation(__HYPERVISOR_memory_op, + "lh", XENMEM_sharing_op, + arg); + rcu_unlock_domain(pd); + break; + } + default: rc = -ENOSYS; break; diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index c5f428d67c..2358808227 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -509,6 +509,12 @@ mfn_t __get_gfn_type_access(struct 
p2m_domain *p2m, unsigned long gfn_l, mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); + /* Check if we need to fork the page */ + if ( (q & P2M_ALLOC) && p2m_is_hole(*t) && + !mem_sharing_fork_page(p2m->domain, gfn, !!(q & P2M_UNSHARE)) ) + mfn = p2m->get_entry(p2m, gfn, t, a, q, page_order, NULL); + + /* Check if we need to unshare the page */ if ( (q & P2M_UNSHARE) && p2m_is_shared(*t) ) { ASSERT(p2m_is_hostp2m(p2m)); @@ -588,7 +594,8 @@ struct page_info *p2m_get_page_from_gfn( return page; /* Error path: not a suitable GFN at all */ - if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) ) + if ( !p2m_is_ram(*t) && !p2m_is_paging(*t) && !p2m_is_pod(*t) && + !mem_sharing_is_fork(p2m->domain) ) return NULL; } diff --git a/xen/common/domain.c b/xen/common/domain.c index 6ad458fa6b..02998235dd 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1269,6 +1269,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) v->vcpu_info = new_info; v->vcpu_info_mfn = page_to_mfn(page); +#ifdef CONFIG_MEM_SHARING + v->vcpu_info_offset = offset; +#endif /* Set new vcpu_info pointer /before/ setting pending flags. 
*/ smp_wmb(); diff --git a/xen/include/asm-x86/hap.h b/xen/include/asm-x86/hap.h index b94bfb4ed0..1bf07e49fe 100644 --- a/xen/include/asm-x86/hap.h +++ b/xen/include/asm-x86/hap.h @@ -45,6 +45,7 @@ int hap_track_dirty_vram(struct domain *d, extern const struct paging_mode *hap_paging_get_mode(struct vcpu *); int hap_set_allocation(struct domain *d, unsigned int pages, bool *preempted); +unsigned int hap_get_allocation(struct domain *d); #endif /* XEN_HAP_H */ diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h index 24da824cbf..35e970b030 100644 --- a/xen/include/asm-x86/hvm/hvm.h +++ b/xen/include/asm-x86/hvm/hvm.h @@ -339,6 +339,8 @@ bool hvm_flush_vcpu_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v), int hvm_copy_context_and_params(struct domain *src, struct domain *dst); +int hvm_get_param(struct domain *d, uint32_t index, uint64_t *value); + #ifdef CONFIG_HVM #define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed(v, 0) diff --git a/xen/include/asm-x86/mem_sharing.h b/xen/include/asm-x86/mem_sharing.h index 53760a2896..ac968fae3f 100644 --- a/xen/include/asm-x86/mem_sharing.h +++ b/xen/include/asm-x86/mem_sharing.h @@ -39,6 +39,9 @@ struct mem_sharing_domain #define mem_sharing_enabled(d) ((d)->arch.hvm.mem_sharing.enabled) +#define mem_sharing_is_fork(d) \ + (mem_sharing_enabled(d) && !!((d)->parent)) + /* Auditing of memory sharing code? 
*/ #ifndef NDEBUG #define MEM_SHARING_AUDIT 1 @@ -88,6 +91,9 @@ static inline int mem_sharing_unshare_page(struct domain *d, return rc; } +int mem_sharing_fork_page(struct domain *d, gfn_t gfn, + bool unsharing); + /* * If called by a foreign domain, possible errors are * -EBUSY -> ring full @@ -117,6 +123,7 @@ int relinquish_shared_pages(struct domain *d); #else #define mem_sharing_enabled(d) false +#define mem_sharing_is_fork(p2m) false static inline unsigned int mem_sharing_get_nr_saved_mfns(void) { @@ -141,6 +148,16 @@ static inline int mem_sharing_notify_enomem(struct domain *d, unsigned long gfn, return -EOPNOTSUPP; } +static inline int mem_sharing_fork(struct domain *d, struct domain *cd, bool vcpu) +{ + return -EOPNOTSUPP; +} + +static inline int mem_sharing_fork_page(struct domain *d, gfn_t gfn, bool lock) +{ + return -EOPNOTSUPP; +} + #endif #endif /* __MEM_SHARING_H__ */ diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h index 126d0ff06e..c1dbad060e 100644 --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -482,6 +482,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t); #define XENMEM_sharing_op_add_physmap 6 #define XENMEM_sharing_op_audit 7 #define XENMEM_sharing_op_range_share 8 +#define XENMEM_sharing_op_fork 9 #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10) #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9) @@ -532,6 +533,10 @@ struct xen_mem_sharing_op { uint32_t gref; /* IN: gref to debug */ } u; } debug; + struct mem_sharing_op_fork { /* OP_FORK */ + domid_t parent_domain; /* IN: parent's domain id */ + uint16_t _pad[3]; /* Must be set to 0 */ + } fork; } u; }; typedef struct xen_mem_sharing_op xen_mem_sharing_op_t; diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h index 3a4f43098c..c6ba5a52a4 100644 --- a/xen/include/xen/sched.h +++ b/xen/include/xen/sched.h @@ -248,6 +248,9 @@ struct vcpu /* Guest-specified relocation of vcpu_info. 
*/
     mfn_t vcpu_info_mfn;
+#ifdef CONFIG_MEM_SHARING
+    uint32_t vcpu_info_offset;
+#endif

     struct evtchn_fifo_vcpu *evtchn_fifo;

@@ -503,6 +506,8 @@ struct domain
     /* Memory sharing support */
 #ifdef CONFIG_MEM_SHARING
     struct vm_event_domain *vm_event_share;
+    struct domain *parent; /* VM fork parent */
+    bool parent_paused;
 #endif
     /* Memory paging support */
 #ifdef CONFIG_HAS_MEM_PAGING

From patchwork Tue Feb 25 19:17:56 2020
X-Patchwork-Submitter: Tamas K Lengyel
X-Patchwork-Id: 11404467
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Date: Tue, 25 Feb 2020 11:17:56 -0800
Message-Id: <628c5cdc73c589e45a19cc0ddb5cf972b00eb3dd.1582658216.git.tamas.lengyel@intel.com>
Subject: [Xen-devel] [PATCH v10 2/3] x86/mem_sharing: reset a fork
Cc: Tamas K Lengyel, Wei Liu, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Jackson, George Dunlap, Stefano Stabellini, Jan Beulich, Julien Grall, Roger Pau Monné

Implement a hypercall that allows a fork to shed all memory that got allocated
for it during its execution and re-load its vCPU context from the parent VM.
This allows the forked VM to reset into the same state the parent VM is in,
faster than creating a new fork would be. Measurements show about a 2x speedup
during normal fuzzing operations. Performance may vary depending on how much
memory got allocated for the forked VM. If it has been completely deduplicated
from the parent VM then creating a new fork would likely be more performant.

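The reset walk in fork_reset() is preemptible: it charges 0x10 units for every page it frees and 1 unit for every page it skips, and checks for preemption once 0x2000 units have accumulated. With 4 KiB pages that is 0x2000 / 0x10 = 512 freed pages (2 MiB) or 8192 skipped pages (32 MiB) between checks. The model below sketches that accounting together with a resume cursor. It is an illustration under assumptions, not the patch's code: toy_reset, the TOY_* constants and the boolean page array are hypothetical, and the cursor here simply stores the index of the next page to visit.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_ERESTART  (-1)      /* hypothetical stand-in for -ERESTART */
#define TOY_BUDGET    0x2000UL  /* work units between preemption checks */
#define TOY_COST_FREE 0x10UL    /* freeing a private page is expensive */
#define TOY_COST_SKIP 0x1UL     /* skipping a page is cheap */

/*
 * Drop every private page of a fork, starting at *cursor. private_page[i]
 * is true when page i was allocated for the fork itself. When the budget is
 * used up and preemption is requested, store the resume position in *cursor
 * and return TOY_ERESTART; the caller is expected to call again with the
 * same cursor. Returns 0 and clears the cursor once the walk has finished.
 */
static int toy_reset(bool *private_page, size_t n, size_t *cursor,
                     bool preempt_requested)
{
    unsigned long spent = 0;

    for ( size_t i = *cursor; i < n; i++ )
    {
        if ( private_page[i] )
        {
            private_page[i] = false;     /* shed the fork's own copy */
            spent += TOY_COST_FREE;
        }
        else
            spent += TOY_COST_SKIP;

        if ( spent >= TOY_BUDGET )
        {
            if ( preempt_requested && i + 1 < n )
            {
                *cursor = i + 1;         /* resume after the last page done */
                return TOY_ERESTART;
            }
            spent = 0;
        }
    }

    *cursor = 0;
    return 0;
}
```

Weighting freed pages more heavily keeps the time between preemption checks roughly bounded whether the fork's page list is mostly private or mostly untouched.
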
Signed-off-by: Tamas K Lengyel
---
v10: implemented hypercall continuation similar to the existing range_share op
---
 xen/arch/x86/mm/mem_sharing.c | 126 +++++++++++++++++++++++++++++++++-
 xen/include/public/memory.h   |   4 ++
 2 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 8ee37e6943..aa4358aae4 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1673,7 +1673,6 @@ static int fork(struct domain *d, struct domain *cd)
         domain_pause(d);
         cd->parent_paused = true;
         cd->max_pages = d->max_pages;
-        cd->max_vcpus = d->max_vcpus;
     }

     /* this is preemptible so it's the first to get done */
@@ -1704,6 +1703,91 @@ static int fork(struct domain *d, struct domain *cd)
     return rc;
 }

+/*
+ * The fork reset operation is intended to be used on short-lived forks only.
+ */
+static int fork_reset(struct domain *d, struct domain *cd,
+                      struct mem_sharing_op_fork_reset *fr)
+{
+    int rc = 0;
+    struct p2m_domain* p2m = p2m_get_hostp2m(cd);
+    struct page_info *page, *tmp;
+    unsigned long list_position = 0, preempt_count = 0, start = fr->opaque;
+
+    domain_pause(cd);
+
+    page_list_for_each_safe(page, tmp, &cd->page_list)
+    {
+        p2m_type_t p2mt;
+        p2m_access_t p2ma;
+        gfn_t gfn;
+        mfn_t mfn;
+        bool shared = false;
+
+        list_position++;
+
+        /* Resume where we left off before preemption */
+        if ( start && list_position < start )
+            continue;
+
+        mfn = page_to_mfn(page);
+        if ( mfn_valid(mfn) )
+        {
+
+            gfn = mfn_to_gfn(cd, mfn);
+            mfn = __get_gfn_type_access(p2m, gfn_x(gfn), &p2mt, &p2ma,
+                                        0, NULL, false);
+
+            if ( p2m_is_ram(p2mt) && !p2m_is_shared(p2mt) )
+            {
+                /* take an extra reference, must work for a shared page */
+                if( !get_page(page, cd) )
+                {
+                    ASSERT_UNREACHABLE();
+                    return -EINVAL;
+                }
+
+                shared = true;
+                preempt_count += 0x10;
+
+                /*
+                 * Must succeed, it's a shared page that exists and
+                 * thus its size is guaranteed to be 4k so we are not splitting
+                 * large pages.
+ */ + rc = p2m->set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_4K, + p2m_invalid, p2m_access_rwx, -1); + ASSERT(!rc); + + put_page_alloc_ref(page); + put_page(page); + } + } + + if ( !shared ) + preempt_count++; + + /* Preempt every 2MiB (shared) or 32MiB (unshared) - arbitrary. */ + if ( preempt_count >= 0x2000 ) + { + if ( hypercall_preempt_check() ) + { + rc = -ERESTART; + break; + } + preempt_count = 0; + } + } + + if ( rc ) + fr->opaque = list_position; + else if ( !(rc = hvm_copy_context_and_params(cd, d)) ) + fork_tsc(cd, d); + + domain_unpause(cd); + return rc; +} + int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) { int rc; @@ -1973,7 +2057,17 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) goto out; if ( !mem_sharing_enabled(pd) && (rc = mem_sharing_control(pd, true)) ) + { + rcu_unlock_domain(pd); goto out; + } + + rc = -EINVAL; + if ( pd->max_vcpus != d->max_vcpus ) + { + rcu_unlock_domain(pd); + goto out; + } rc = fork(pd, d); @@ -1985,6 +2079,36 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg) break; } + case XENMEM_sharing_op_fork_reset: + { + struct domain *pd; + + rc = -ENOSYS; + if ( !mem_sharing_is_fork(d) ) + goto out; + + rc = rcu_lock_live_remote_domain_by_id(d->parent->domain_id, &pd); + if ( rc ) + goto out; + + rc = fork_reset(pd, d, &mso.u.fork_reset); + + rcu_unlock_domain(pd); + + if ( rc > 0 ) + { + if ( __copy_to_guest(arg, &mso, 1) ) + rc = -EFAULT; + else + rc = hypercall_create_continuation(__HYPERVISOR_memory_op, + "lh", XENMEM_sharing_op, + arg); + } + else + mso.u.fork_reset.opaque = 0; + break; + } + default: rc = -ENOSYS; break; diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h index c1dbad060e..7ca07c01dd 100644 --- a/xen/include/public/memory.h +++ b/xen/include/public/memory.h @@ -483,6 +483,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_mem_access_op_t); #define XENMEM_sharing_op_audit 7 #define XENMEM_sharing_op_range_share 8 #define 
XENMEM_sharing_op_fork 9
+#define XENMEM_sharing_op_fork_reset 10

 #define XENMEM_SHARING_OP_S_HANDLE_INVALID (-10)
 #define XENMEM_SHARING_OP_C_HANDLE_INVALID (-9)
@@ -537,6 +538,9 @@ struct xen_mem_sharing_op {
             domid_t parent_domain; /* IN: parent's domain id */
             uint16_t _pad[3]; /* Must be set to 0 */
         } fork;
+        struct mem_sharing_op_fork_reset { /* OP_FORK_RESET */
+            uint64_aligned_t opaque; /* Must be set to 0 */
+        } fork_reset;
     } u;
 };
 typedef struct xen_mem_sharing_op xen_mem_sharing_op_t;

From patchwork Tue Feb 25 19:17:57 2020
X-Patchwork-Submitter: Tamas K Lengyel
X-Patchwork-Id: 11404469
From: Tamas K Lengyel
To: xen-devel@lists.xenproject.org
Date: Tue, 25 Feb 2020 11:17:57 -0800
Message-Id: <469220b87755c4d222af85186b992bc4ffe15fac.1582658216.git.tamas.lengyel@intel.com>
Subject: [Xen-devel] [PATCH v10 3/3] xen/tools: VM forking toolstack side
Cc: Anthony PERARD, Ian Jackson, Tamas K Lengyel, Wei Liu

Add necessary bits to implement "xl fork-vm" commands. The command allows the
user to specify how to launch the device model allowing for a late-launch model
in which the user can execute the fork without the device model and decide to
only later launch it.

Signed-off-by: Tamas K Lengyel --- v10: move code into x86 specific code so it doesn't get compiled on ARM compile tested on ARM require user to specify parent's max-vcpu setting document limitation on certain options having to be set to default --- docs/man/xl.1.pod.in | 44 +++++ tools/libxc/include/xenctrl.h | 13 ++ tools/libxc/xc_memshr.c | 22 +++ tools/libxl/libxl.h | 11 ++ tools/libxl/libxl_create.c | 361 +++++++++++++++++++--------------- tools/libxl/libxl_dm.c | 2 +- tools/libxl/libxl_dom.c | 43 +++- tools/libxl/libxl_internal.h | 7 + tools/libxl/libxl_types.idl | 1 + tools/libxl/libxl_x86.c | 41 ++++ tools/xl/Makefile | 2 +- tools/xl/xl.h | 5 + tools/xl/xl_cmdtable.c | 15 ++ tools/xl/xl_forkvm.c | 147 ++++++++++++++ tools/xl/xl_vmcontrol.c | 14 ++ 15 files changed, 562 insertions(+), 166 deletions(-) create mode 100644 tools/xl/xl_forkvm.c diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in index 09339282e6..59c03c6427 100644 --- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -708,6 +708,50 @@ above). =back +=item B [I] I + +Create a fork of a running VM. The domain will be paused after the operation +and remains paused while forks of it exist. Experimental and x86 only. +Forks can only be made of domains with HAP enabled and on Intel hardware. The +parent domain must be created with the xl toolstack and its configuration must +not manually define max_grant_frames, max_maptrack_frames or max_event_channels. + +B + +=over 4 + +=item B<-p> + +Leave the fork paused after creating it. + +=item B<--launch-dm> + +Specify whether the device model (QEMU) should be launched for the fork. Late +launch allows to start the device model for an already running fork. + +=item B<-C> + +The config file to use when launching the device model. Currently required when +launching the device model. Most config settings MUST match the parent domain +exactly, only change VM name, disk path and network configurations. 
+
+=item B<-Q>
+
+The path to the qemu save file to use when launching the device model. Currently
+required when launching the device model.
+
+=item B<--fork-reset>
+
+Perform a reset operation of an already running fork. Note that resetting may
+be less performant than creating a new fork depending on how much memory the
+fork has deduplicated during its runtime.
+
+=item B<--max-vcpus>
+
+Specify the max-vcpus matching the parent domain when not launching the dm.
+
+=back
+
 =item B [I]

 Display the number of shared pages for a specified domain. If no domain is
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 99552a5f73..90fce83196 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2225,6 +2225,19 @@ int xc_memshr_range_share(xc_interface *xch,
                           uint64_t first_gfn,
                           uint64_t last_gfn);

+int xc_memshr_fork(xc_interface *xch,
+                   uint32_t source_domain,
+                   uint32_t client_domain);
+
+/*
+ * Note: this function is only intended to be used on short-lived forks that
+ * haven't yet acquired a lot of memory. In case the fork has a lot of memory
+ * it is likely more performant to create a new fork with xc_memshr_fork.
+ *
+ * With VMs that have a lot of memory this call may block for a long time.
+ */
+int xc_memshr_fork_reset(xc_interface *xch, uint32_t forked_domain);
+
 /* Debug calls: return the number of pages referencing the shared frame backing
  * the input argument. Should be one or greater.
* diff --git a/tools/libxc/xc_memshr.c b/tools/libxc/xc_memshr.c index 97e2e6a8d9..d0e4ee225b 100644 --- a/tools/libxc/xc_memshr.c +++ b/tools/libxc/xc_memshr.c @@ -239,6 +239,28 @@ int xc_memshr_debug_gref(xc_interface *xch, return xc_memshr_memop(xch, domid, &mso); } +int xc_memshr_fork(xc_interface *xch, uint32_t pdomid, uint32_t domid) +{ + xen_mem_sharing_op_t mso; + + memset(&mso, 0, sizeof(mso)); + + mso.op = XENMEM_sharing_op_fork; + mso.u.fork.parent_domain = pdomid; + + return xc_memshr_memop(xch, domid, &mso); +} + +int xc_memshr_fork_reset(xc_interface *xch, uint32_t domid) +{ + xen_mem_sharing_op_t mso; + + memset(&mso, 0, sizeof(mso)); + mso.op = XENMEM_sharing_op_fork_reset; + + return xc_memshr_memop(xch, domid, &mso); +} + int xc_memshr_audit(xc_interface *xch) { xen_mem_sharing_op_t mso; diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 35e13428b2..6b968bdcb4 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -2657,6 +2657,17 @@ int libxl_psr_get_hw_info(libxl_ctx *ctx, libxl_psr_feat_type type, unsigned int lvl, unsigned int *nr, libxl_psr_hw_info **info); void libxl_psr_hw_info_list_free(libxl_psr_hw_info *list, unsigned int nr); + +int libxl_domain_fork_vm(libxl_ctx *ctx, uint32_t pdomid, uint32_t max_vcpus, uint32_t *domid) + LIBXL_EXTERNAL_CALLERS_ONLY; + +int libxl_domain_fork_launch_dm(libxl_ctx *ctx, libxl_domain_config *d_config, + uint32_t domid, + const libxl_asyncprogress_how *aop_console_how) + LIBXL_EXTERNAL_CALLERS_ONLY; + +int libxl_domain_fork_reset(libxl_ctx *ctx, uint32_t domid) + LIBXL_EXTERNAL_CALLERS_ONLY; #endif /* misc */ diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c index ccc9e70990..eba8ce419b 100644 --- a/tools/libxl/libxl_create.c +++ b/tools/libxl/libxl_create.c @@ -536,12 +536,12 @@ out: return ret; } -int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, - libxl__domain_build_state *state, - uint32_t *domid, bool soft_reset) +static int 
libxl__domain_make_xs_entries(libxl__gc *gc, libxl_domain_config *d_config, + libxl__domain_build_state *state, + uint32_t domid) { libxl_ctx *ctx = libxl__gc_owner(gc); - int ret, rc, nb_vm; + int rc, nb_vm; const char *dom_type; char *uuid_string; char *dom_path, *vm_path, *libxl_path; @@ -553,9 +553,6 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, /* convenience aliases */ libxl_domain_create_info *info = &d_config->c_info; - libxl_domain_build_info *b_info = &d_config->b_info; - - assert(soft_reset || *domid == INVALID_DOMID); uuid_string = libxl__uuid2string(gc, info->uuid); if (!uuid_string) { @@ -563,137 +560,7 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, goto out; } - if (!soft_reset) { - struct xen_domctl_createdomain create = { - .ssidref = info->ssidref, - .max_vcpus = b_info->max_vcpus, - .max_evtchn_port = b_info->event_channels, - .max_grant_frames = b_info->max_grant_frames, - .max_maptrack_frames = b_info->max_maptrack_frames, - }; - - if (info->type != LIBXL_DOMAIN_TYPE_PV) { - create.flags |= XEN_DOMCTL_CDF_hvm; - create.flags |= - libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0; - create.flags |= - libxl_defbool_val(info->oos) ? 
0 : XEN_DOMCTL_CDF_oos_off; - } - - assert(info->passthrough != LIBXL_PASSTHROUGH_DEFAULT); - LOG(DETAIL, "passthrough: %s", - libxl_passthrough_to_string(info->passthrough)); - - if (info->passthrough != LIBXL_PASSTHROUGH_DISABLED) - create.flags |= XEN_DOMCTL_CDF_iommu; - - if (info->passthrough == LIBXL_PASSTHROUGH_SYNC_PT) - create.iommu_opts |= XEN_DOMCTL_IOMMU_no_sharept; - - /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ - libxl_uuid_copy(ctx, (libxl_uuid *)&create.handle, &info->uuid); - - ret = libxl__arch_domain_prepare_config(gc, d_config, &create); - if (ret < 0) { - LOGED(ERROR, *domid, "fail to get domain config"); - rc = ERROR_FAIL; - goto out; - } - - for (;;) { - uint32_t local_domid; - bool recent; - - if (info->domid == RANDOM_DOMID) { - uint16_t v; - - ret = libxl__random_bytes(gc, (void *)&v, sizeof(v)); - if (ret < 0) - break; - - v &= DOMID_MASK; - if (!libxl_domid_valid_guest(v)) - continue; - - local_domid = v; - } else { - local_domid = info->domid; /* May not be valid */ - } - - ret = xc_domain_create(ctx->xch, &local_domid, &create); - if (ret < 0) { - /* - * If we generated a random domid and creation failed - * because that domid already exists then simply try - * again. - */ - if (errno == EEXIST && info->domid == RANDOM_DOMID) - continue; - - LOGED(ERROR, local_domid, "domain creation fail"); - rc = ERROR_FAIL; - goto out; - } - - /* A new domain now exists */ - *domid = local_domid; - - rc = libxl__is_domid_recent(gc, local_domid, &recent); - if (rc) - goto out; - - /* The domid is not recent, so we're done */ - if (!recent) - break; - - /* - * If the domid was specified then there's no point in - * trying again. - */ - if (libxl_domid_valid_guest(info->domid)) { - LOGED(ERROR, local_domid, "domain id recently used"); - rc = ERROR_FAIL; - goto out; - } - - /* - * The domain is recent and so cannot be used. 
Clear domid - * here since, if xc_domain_destroy() fails below there is - * little point calling it again in the error path. - */ - *domid = INVALID_DOMID; - - ret = xc_domain_destroy(ctx->xch, local_domid); - if (ret < 0) { - LOGED(ERROR, local_domid, "domain destroy fail"); - rc = ERROR_FAIL; - goto out; - } - - /* The domain was successfully destroyed, so we can try again */ - } - - rc = libxl__arch_domain_save_config(gc, d_config, state, &create); - if (rc < 0) - goto out; - } - - /* - * If soft_reset is set the the domid will have been valid on entry. - * If it was not set then xc_domain_create() should have assigned a - * valid value. Either way, if we reach this point, domid should be - * valid. - */ - assert(libxl_domid_valid_guest(*domid)); - - ret = xc_cpupool_movedomain(ctx->xch, info->poolid, *domid); - if (ret < 0) { - LOGED(ERROR, *domid, "domain move fail"); - rc = ERROR_FAIL; - goto out; - } - - dom_path = libxl__xs_get_dompath(gc, *domid); + dom_path = libxl__xs_get_dompath(gc, domid); if (!dom_path) { rc = ERROR_FAIL; goto out; @@ -701,12 +568,12 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, vm_path = GCSPRINTF("/vm/%s", uuid_string); if (!vm_path) { - LOGD(ERROR, *domid, "cannot allocate create paths"); + LOGD(ERROR, domid, "cannot allocate create paths"); rc = ERROR_FAIL; goto out; } - libxl_path = libxl__xs_libxl_path(gc, *domid); + libxl_path = libxl__xs_libxl_path(gc, domid); if (!libxl_path) { rc = ERROR_FAIL; goto out; @@ -717,10 +584,10 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, roperm[0].id = 0; roperm[0].perms = XS_PERM_NONE; - roperm[1].id = *domid; + roperm[1].id = domid; roperm[1].perms = XS_PERM_READ; - rwperm[0].id = *domid; + rwperm[0].id = domid; rwperm[0].perms = XS_PERM_NONE; retry_transaction: @@ -738,7 +605,7 @@ retry_transaction: noperm, ARRAY_SIZE(noperm)); xs_write(ctx->xsh, t, GCSPRINTF("%s/vm", dom_path), vm_path, strlen(vm_path)); - rc = libxl__domain_rename(gc, 
*domid, 0, info->name, t); + rc = libxl__domain_rename(gc, domid, 0, info->name, t); if (rc) goto out; @@ -815,7 +682,7 @@ retry_transaction: vm_list = libxl_list_vm(ctx, &nb_vm); if (!vm_list) { - LOGD(ERROR, *domid, "cannot get number of running guests"); + LOGD(ERROR, domid, "cannot get number of running guests"); rc = ERROR_FAIL; goto out; } @@ -839,7 +706,7 @@ retry_transaction: t = 0; goto retry_transaction; } - LOGED(ERROR, *domid, "domain creation ""xenstore transaction commit failed"); + LOGED(ERROR, domid, "domain creation ""xenstore transaction commit failed"); rc = ERROR_FAIL; goto out; } @@ -851,6 +718,155 @@ retry_transaction: return rc; } +int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config, + libxl__domain_build_state *state, + uint32_t *domid, bool soft_reset) +{ + libxl_ctx *ctx = libxl__gc_owner(gc); + int ret, rc; + + /* convenience aliases */ + libxl_domain_create_info *info = &d_config->c_info; + libxl_domain_build_info *b_info = &d_config->b_info; + + assert(soft_reset || *domid == INVALID_DOMID); + + if (!soft_reset) { + struct xen_domctl_createdomain create = { + .ssidref = info->ssidref, + .max_vcpus = b_info->max_vcpus, + .max_evtchn_port = b_info->event_channels, + .max_grant_frames = b_info->max_grant_frames, + .max_maptrack_frames = b_info->max_maptrack_frames, + }; + + if (info->type != LIBXL_DOMAIN_TYPE_PV) { + create.flags |= XEN_DOMCTL_CDF_hvm; + create.flags |= + libxl_defbool_val(info->hap) ? XEN_DOMCTL_CDF_hap : 0; + create.flags |= + libxl_defbool_val(info->oos) ? 
0 : XEN_DOMCTL_CDF_oos_off; + } + + assert(info->passthrough != LIBXL_PASSTHROUGH_DEFAULT); + LOG(DETAIL, "passthrough: %s", + libxl_passthrough_to_string(info->passthrough)); + + if (info->passthrough != LIBXL_PASSTHROUGH_DISABLED) + create.flags |= XEN_DOMCTL_CDF_iommu; + + if (info->passthrough == LIBXL_PASSTHROUGH_SYNC_PT) + create.iommu_opts |= XEN_DOMCTL_IOMMU_no_sharept; + + /* Ultimately, handle is an array of 16 uint8_t, same as uuid */ + libxl_uuid_copy(ctx, (libxl_uuid *)&create.handle, &info->uuid); + + ret = libxl__arch_domain_prepare_config(gc, d_config, &create); + if (ret < 0) { + LOGED(ERROR, *domid, "fail to get domain config"); + rc = ERROR_FAIL; + goto out; + } + + for (;;) { + uint32_t local_domid; + bool recent; + + if (info->domid == RANDOM_DOMID) { + uint16_t v; + + ret = libxl__random_bytes(gc, (void *)&v, sizeof(v)); + if (ret < 0) + break; + + v &= DOMID_MASK; + if (!libxl_domid_valid_guest(v)) + continue; + + local_domid = v; + } else { + local_domid = info->domid; /* May not be valid */ + } + + ret = xc_domain_create(ctx->xch, &local_domid, &create); + if (ret < 0) { + /* + * If we generated a random domid and creation failed + * because that domid already exists then simply try + * again. + */ + if (errno == EEXIST && info->domid == RANDOM_DOMID) + continue; + + LOGED(ERROR, local_domid, "domain creation fail"); + rc = ERROR_FAIL; + goto out; + } + + /* A new domain now exists */ + *domid = local_domid; + + rc = libxl__is_domid_recent(gc, local_domid, &recent); + if (rc) + goto out; + + /* The domid is not recent, so we're done */ + if (!recent) + break; + + /* + * If the domid was specified then there's no point in + * trying again. + */ + if (libxl_domid_valid_guest(info->domid)) { + LOGED(ERROR, local_domid, "domain id recently used"); + rc = ERROR_FAIL; + goto out; + } + + /* + * The domain is recent and so cannot be used. 
Clear domid + * here since, if xc_domain_destroy() fails below there is + * little point calling it again in the error path. + */ + *domid = INVALID_DOMID; + + ret = xc_domain_destroy(ctx->xch, local_domid); + if (ret < 0) { + LOGED(ERROR, local_domid, "domain destroy fail"); + rc = ERROR_FAIL; + goto out; + } + + /* The domain was successfully destroyed, so we can try again */ + } + + rc = libxl__arch_domain_save_config(gc, d_config, state, &create); + if (rc < 0) + goto out; + } + + /* + * If soft_reset is set the the domid will have been valid on entry. + * If it was not set then xc_domain_create() should have assigned a + * valid value. Either way, if we reach this point, domid should be + * valid. + */ + assert(libxl_domid_valid_guest(*domid)); + + ret = xc_cpupool_movedomain(ctx->xch, info->poolid, *domid); + if (ret < 0) { + LOGED(ERROR, *domid, "domain move fail"); + rc = ERROR_FAIL; + goto out; + } + + rc = libxl__domain_make_xs_entries(gc, d_config, state, *domid); + +out: + return rc; +} + static int store_libxl_entry(libxl__gc *gc, uint32_t domid, libxl_domain_build_info *b_info) { @@ -1172,16 +1188,32 @@ static void initiate_domain_create(libxl__egc *egc, ret = libxl__domain_config_setdefault(gc,d_config,domid); if (ret) goto error_out; - ret = libxl__domain_make(gc, d_config, &dcs->build_state, &domid, - dcs->soft_reset); - if (ret) { - LOGD(ERROR, domid, "cannot make domain: %d", ret); + if ( !d_config->dm_restore_file ) + { + ret = libxl__domain_make(gc, d_config, &dcs->build_state, &domid, + dcs->soft_reset); dcs->guest_domid = domid; + + if (ret) { + LOGD(ERROR, domid, "cannot make domain: %d", ret); + ret = ERROR_FAIL; + goto error_out; + } + } else if ( dcs->guest_domid != INVALID_DOMID ) { + domid = dcs->guest_domid; + + ret = libxl__domain_make_xs_entries(gc, d_config, &dcs->build_state, domid); + if (ret) { + LOGD(ERROR, domid, "cannot make domain: %d", ret); + ret = ERROR_FAIL; + goto error_out; + } + } else { + LOGD(ERROR, domid, "cannot 
make domain"); ret = ERROR_FAIL; goto error_out; } - dcs->guest_domid = domid; dcs->sdss.dm.guest_domid = 0; /* means we haven't spawned */ /* post-4.13 todo: move these next bits of defaulting to @@ -1217,7 +1249,7 @@ static void initiate_domain_create(libxl__egc *egc, if (ret) goto error_out; - if (restore_fd >= 0 || dcs->soft_reset) { + if (restore_fd >= 0 || dcs->soft_reset || d_config->dm_restore_file) { LOGD(DEBUG, domid, "restoring, not running bootloader"); domcreate_bootloader_done(egc, &dcs->bl, 0); } else { @@ -1293,7 +1325,16 @@ static void domcreate_bootloader_done(libxl__egc *egc, dcs->sdss.dm.callback = domcreate_devmodel_started; dcs->sdss.callback = domcreate_devmodel_started; - if (restore_fd < 0 && !dcs->soft_reset) { + if (restore_fd < 0 && !dcs->soft_reset && !d_config->dm_restore_file) { + rc = libxl__domain_build(gc, d_config, domid, state); + domcreate_rebuild_done(egc, dcs, rc); + return; + } + + if ( d_config->dm_restore_file ) { + dcs->srs.dcs = dcs; + dcs->srs.ao = ao; + state->forked_vm = true; rc = libxl__domain_build(gc, d_config, domid, state); domcreate_rebuild_done(egc, dcs, rc); return; @@ -1491,6 +1532,7 @@ static void domcreate_rebuild_done(libxl__egc *egc, /* convenience aliases */ const uint32_t domid = dcs->guest_domid; libxl_domain_config *const d_config = dcs->guest_config; + libxl__domain_build_state *const state = &dcs->build_state; if (ret) { LOGD(ERROR, domid, "cannot (re-)build domain: %d", ret); @@ -1498,6 +1540,9 @@ static void domcreate_rebuild_done(libxl__egc *egc, goto error_out; } + if ( d_config->dm_restore_file ) + state->saved_state = GCSPRINTF("%s", d_config->dm_restore_file); + store_libxl_entry(gc, domid, &d_config->b_info); libxl__multidev_begin(ao, &dcs->multidev); @@ -1886,7 +1931,7 @@ static void domain_create_cb(libxl__egc *egc, libxl__domain_create_state *dcs, int rc, uint32_t domid); -static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, +int libxl__do_domain_create(libxl_ctx 
*ctx, libxl_domain_config *d_config, uint32_t *domid, int restore_fd, int send_back_fd, const libxl_domain_restore_params *params, const libxl_asyncop_how *ao_how, @@ -1899,6 +1944,8 @@ static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config, GCNEW(cdcs); cdcs->dcs.ao = ao; cdcs->dcs.guest_config = d_config; + cdcs->dcs.guest_domid = *domid; + libxl_domain_config_init(&cdcs->dcs.guest_config_saved); libxl_domain_config_copy(ctx, &cdcs->dcs.guest_config_saved, d_config); cdcs->dcs.restore_fd = cdcs->dcs.libxc_fd = restore_fd; @@ -2143,8 +2190,8 @@ int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config, const libxl_asyncprogress_how *aop_console_how) { unset_disk_colo_restore(d_config); - return do_domain_create(ctx, d_config, domid, -1, -1, NULL, - ao_how, aop_console_how); + return libxl__do_domain_create(ctx, d_config, domid, -1, -1, NULL, + ao_how, aop_console_how); } int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config, @@ -2160,8 +2207,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, libxl_domain_config *d_config, unset_disk_colo_restore(d_config); } - return do_domain_create(ctx, d_config, domid, restore_fd, send_back_fd, - params, ao_how, aop_console_how); + return libxl__do_domain_create(ctx, d_config, domid, restore_fd, send_back_fd, + params, ao_how, aop_console_how); } int libxl_domain_soft_reset(libxl_ctx *ctx, diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c index 3b1da90167..87ae1478cf 100644 --- a/tools/libxl/libxl_dm.c +++ b/tools/libxl/libxl_dm.c @@ -2787,7 +2787,7 @@ static void device_model_spawn_outcome(libxl__egc *egc, libxl__domain_build_state *state = dmss->build_state; - if (state->saved_state) { + if (state->saved_state && !state->forked_vm) { ret2 = unlink(state->saved_state); if (ret2) { LOGED(ERROR, dmss->guest_domid, "%s: failed to remove device-model state %s", diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c index 71cb578923..3bc7117b99 100644 
--- a/tools/libxl/libxl_dom.c +++ b/tools/libxl/libxl_dom.c @@ -249,9 +249,12 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, libxl_domain_build_info *const info = &d_config->b_info; libxl_ctx *ctx = libxl__gc_owner(gc); char *xs_domid, *con_domid; - int rc; + int rc = 0; uint64_t size; + if ( state->forked_vm ) + goto skip_fork; + if (xc_domain_max_vcpus(ctx->xch, domid, info->max_vcpus) != 0) { LOG(ERROR, "Couldn't set max vcpu count"); return ERROR_FAIL; @@ -362,7 +365,6 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, } } - rc = libxl__arch_extra_memory(gc, info, &size); if (rc < 0) { LOGE(ERROR, "Couldn't get arch extra constant memory size"); @@ -374,6 +376,11 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, return ERROR_FAIL; } + rc = libxl__arch_domain_create(gc, d_config, domid); + if ( rc ) + goto out; + +skip_fork: xs_domid = xs_read(ctx->xsh, XBT_NULL, "/tool/xenstored/domid", NULL); state->store_domid = xs_domid ? atoi(xs_domid) : 0; free(xs_domid); @@ -385,8 +392,7 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, state->store_port = xc_evtchn_alloc_unbound(ctx->xch, domid, state->store_domid); state->console_port = xc_evtchn_alloc_unbound(ctx->xch, domid, state->console_domid); - rc = libxl__arch_domain_create(gc, d_config, domid); - +out: return rc; } @@ -444,6 +450,9 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid, char **ents; int i, rc; + if ( state->forked_vm ) + goto skip_fork; + if (info->num_vnuma_nodes && !info->num_vcpu_soft_affinity) { rc = set_vnuma_affinity(gc, domid, info); if (rc) @@ -466,6 +475,7 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid, } } +skip_fork: ents = libxl__calloc(gc, 12 + (info->max_vcpus * 2) + 2, sizeof(char *)); ents[0] = "memory/static-max"; ents[1] = GCSPRINTF("%"PRId64, info->max_memkb); @@ -728,14 +738,16 @@ static int hvm_build_set_params(xc_interface *handle, uint32_t domid, libxl_domain_build_info *info, int store_evtchn, unsigned long *store_mfn, int console_evtchn, 
unsigned long *console_mfn, - domid_t store_domid, domid_t console_domid) + domid_t store_domid, domid_t console_domid, + bool forked_vm) { struct hvm_info_table *va_hvm; uint8_t *va_map, sum; uint64_t str_mfn, cons_mfn; int i; - if (info->type == LIBXL_DOMAIN_TYPE_HVM) { + if ( info->type == LIBXL_DOMAIN_TYPE_HVM && !forked_vm ) + { va_map = xc_map_foreign_range(handle, domid, XC_PAGE_SIZE, PROT_READ | PROT_WRITE, HVM_INFO_PFN); @@ -1051,6 +1063,23 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, struct xc_dom_image *dom = NULL; bool device_model = info->type == LIBXL_DOMAIN_TYPE_HVM ? true : false; + if ( state->forked_vm ) + { + rc = hvm_build_set_params(ctx->xch, domid, info, state->store_port, + &state->store_mfn, state->console_port, + &state->console_mfn, state->store_domid, + state->console_domid, state->forked_vm); + + if ( rc ) + return rc; + + return xc_dom_gnttab_seed(ctx->xch, domid, true, + state->console_mfn, + state->store_mfn, + state->console_domid, + state->store_domid); + } + xc_dom_loginit(ctx->xch); /* @@ -1175,7 +1204,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid, rc = hvm_build_set_params(ctx->xch, domid, info, state->store_port, &state->store_mfn, state->console_port, &state->console_mfn, state->store_domid, - state->console_domid); + state->console_domid, false); if (rc != 0) { LOG(ERROR, "hvm build set params failed"); goto out; diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index 43e5885d1e..6d1e20589d 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1374,6 +1374,7 @@ typedef struct { char *saved_state; int dm_monitor_fd; + bool forked_vm; libxl__file_reference pv_kernel; libxl__file_reference pv_ramdisk; @@ -4813,6 +4814,12 @@ _hidden int libxl__domain_pvcontrol(libxl__egc *egc, /* Check whether a domid is recent */ int libxl__is_domid_recent(libxl__gc *gc, uint32_t domid, bool *recent); +_hidden int libxl__do_domain_create(libxl_ctx *ctx, libxl_domain_config 
*d_config, + uint32_t *domid, int restore_fd, int send_back_fd, + const libxl_domain_restore_params *params, + const libxl_asyncop_how *ao_how, + const libxl_asyncprogress_how *aop_console_how); + #endif /* diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl index d0d431614f..55c282b1cd 100644 --- a/tools/libxl/libxl_types.idl +++ b/tools/libxl/libxl_types.idl @@ -957,6 +957,7 @@ libxl_domain_config = Struct("domain_config", [ ("on_watchdog", libxl_action_on_shutdown), ("on_crash", libxl_action_on_shutdown), ("on_soft_reset", libxl_action_on_shutdown), + ("dm_restore_file", string, {'const': True}), ], dir=DIR_IN) libxl_diskinfo = Struct("diskinfo", [ diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c index f8bc828e62..f4312411fc 100644 --- a/tools/libxl/libxl_x86.c +++ b/tools/libxl/libxl_x86.c @@ -2,6 +2,7 @@ #include "libxl_arch.h" #include +#include int libxl__arch_domain_prepare_config(libxl__gc *gc, libxl_domain_config *d_config, @@ -842,6 +843,46 @@ int libxl__arch_passthrough_mode_setdefault(libxl__gc *gc, return rc; } +/* + * The parent domain is expected to be created with default settings for + * - max_evtch_port + * - max_grant_frames + * - max_maptrack_frames + */ +int libxl_domain_fork_vm(libxl_ctx *ctx, uint32_t pdomid, uint32_t max_vcpus, uint32_t *domid) +{ + int rc; + struct xen_domctl_createdomain create = {0}; + create.flags |= XEN_DOMCTL_CDF_hvm; + create.flags |= XEN_DOMCTL_CDF_hap; + create.flags |= XEN_DOMCTL_CDF_oos_off; + create.arch.emulation_flags = (XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI); + create.ssidref = SECINITSID_DOMU; + create.max_vcpus = max_vcpus; + create.max_evtchn_port = 1023; + create.max_grant_frames = LIBXL_MAX_GRANT_FRAMES_DEFAULT; + create.max_maptrack_frames = LIBXL_MAX_MAPTRACK_FRAMES_DEFAULT; + + if ( (rc = xc_domain_create(ctx->xch, domid, &create)) ) + return rc; + + if ( (rc = xc_memshr_fork(ctx->xch, pdomid, *domid)) ) + xc_domain_destroy(ctx->xch, *domid); + + return rc; +} + +int 
libxl_domain_fork_launch_dm(libxl_ctx *ctx, libxl_domain_config *d_config, + uint32_t domid, + const libxl_asyncprogress_how *aop_console_how) +{ + return libxl__do_domain_create(ctx, d_config, &domid, -1, -1, 0, 0, aop_console_how); +} + +int libxl_domain_fork_reset(libxl_ctx *ctx, uint32_t domid) +{ + return xc_memshr_fork_reset(ctx->xch, domid); +} /* * Local variables: diff --git a/tools/xl/Makefile b/tools/xl/Makefile index af4912e67a..073222233b 100644 --- a/tools/xl/Makefile +++ b/tools/xl/Makefile @@ -15,7 +15,7 @@ LDFLAGS += $(PTHREAD_LDFLAGS) CFLAGS_XL += $(CFLAGS_libxenlight) CFLAGS_XL += -Wshadow -XL_OBJS-$(CONFIG_X86) = xl_psr.o +XL_OBJS-$(CONFIG_X86) = xl_psr.o xl_forkvm.o XL_OBJS = xl.o xl_cmdtable.o xl_sxp.o xl_utils.o $(XL_OBJS-y) XL_OBJS += xl_parse.o xl_cpupool.o xl_flask.o XL_OBJS += xl_vtpm.o xl_block.o xl_nic.o xl_usb.o diff --git a/tools/xl/xl.h b/tools/xl/xl.h index 06569c6c4a..1105c34b15 100644 --- a/tools/xl/xl.h +++ b/tools/xl/xl.h @@ -31,6 +31,7 @@ struct cmd_spec { }; struct domain_create { + uint32_t ddomid; /* fork launch dm for this domid */ int debug; int daemonize; int monitor; /* handle guest reboots etc */ @@ -45,6 +46,7 @@ struct domain_create { const char *config_file; char *extra_config; /* extra config string */ const char *restore_file; + const char *dm_restore_file; char *colo_proxy_script; bool userspace_colo_proxy; int migrate_fd; /* -1 means none */ @@ -128,6 +130,8 @@ int main_pciassignable_remove(int argc, char **argv); int main_pciassignable_list(int argc, char **argv); #ifndef LIBXL_HAVE_NO_SUSPEND_RESUME int main_restore(int argc, char **argv); +int main_fork_launch_dm(int argc, char **argv); +int main_fork_reset(int argc, char **argv); int main_migrate_receive(int argc, char **argv); int main_save(int argc, char **argv); int main_migrate(int argc, char **argv); @@ -212,6 +216,7 @@ int main_psr_cat_cbm_set(int argc, char **argv); int main_psr_cat_show(int argc, char **argv); int main_psr_mba_set(int argc, char 
**argv); int main_psr_mba_show(int argc, char **argv); +int main_fork_vm(int argc, char **argv); #endif int main_qemu_monitor_command(int argc, char **argv); diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 08335394e5..ef634abf32 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -187,6 +187,21 @@ struct cmd_spec cmd_table[] = { "Restore a domain from a saved state", "- for internal use only", }, +#if defined(__i386__) || defined(__x86_64__) + { "fork-vm", + &main_fork_vm, 0, 1, + "Fork a domain from the running parent domid. Experimental. Most config settings must match parent.", + "[options] ", + "-h Print this help.\n" + "-C Use config file for VM fork.\n" + "-Q Use qemu save file for VM fork.\n" + "--launch-dm Launch device model (QEMU) for VM fork.\n" + "--fork-reset Reset VM fork.\n" + "--max-vcpus Specify max-vcpus matching the parent domain when not launching dm\n" + "-p Do not unpause fork VM after operation.\n" + "-d Enable debug messages.\n" + }, +#endif #endif { "dump-core", &main_dump_core, 0, 1, diff --git a/tools/xl/xl_forkvm.c b/tools/xl/xl_forkvm.c new file mode 100644 index 0000000000..a7ee5b4771 --- /dev/null +++ b/tools/xl/xl_forkvm.c @@ -0,0 +1,147 @@ +/* + * Copyright 2020 Intel Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "xl.h" +#include "xl_utils.h" +#include "xl_parse.h" + +int main_fork_vm(int argc, char **argv) +{ + int rc, debug = 0; + uint32_t domid_in = INVALID_DOMID, domid_out = INVALID_DOMID; + int launch_dm = 1; + bool reset = 0; + bool pause = 0; + const char *config_file = NULL; + const char *dm_restore_file = NULL; + uint32_t max_vcpus = 0; + + int opt; + static struct option opts[] = { + {"launch-dm", 1, 0, 'l'}, + {"fork-reset", 0, 0, 'r'}, + {"max-vcpus", 1, 0, 'm'}, + COMMON_LONG_OPTS + }; + + SWITCH_FOREACH_OPT(opt, "phdC:Q:l:rm:N:D:B:V:", opts, "fork-vm", 1) { + case 'd': + debug = 1; + break; + case 'p': + pause = 1; + break; + case 'm': + max_vcpus = atoi(optarg); + break; + case 'C': + config_file = optarg; + break; + case 'Q': + dm_restore_file = optarg; + break; + case 'l': + if ( !strcmp(optarg, "no") ) + launch_dm = 0; + if ( !strcmp(optarg, "yes") ) + launch_dm = 1; + if ( !strcmp(optarg, "late") ) + launch_dm = 2; + break; + case 'r': + reset = 1; + break; + case 'N': /* fall-through */ + case 'D': /* fall-through */ + case 'B': /* fall-through */ + case 'V': + fprintf(stderr, "Unimplemented option(s)\n"); + return EXIT_FAILURE; + } + + if (argc-optind == 1) { + domid_in = atoi(argv[optind]); + } else { + help("fork-vm"); + return EXIT_FAILURE; + } + + if (launch_dm && (!config_file || !dm_restore_file)) { + fprintf(stderr, "Currently you must provide both -C and -Q options\n"); + return EXIT_FAILURE; + } + + if (reset) { + domid_out = domid_in; + if (libxl_domain_fork_reset(ctx, domid_in)) + return EXIT_FAILURE; + } + + if (launch_dm == 2 || reset) { + domid_out = domid_in; + rc = EXIT_SUCCESS; + } else { + if ( !max_vcpus ) + { + fprintf(stderr, "Currently you must specify the parent's max_vcpus for this option\n"); + return EXIT_FAILURE; + } + + rc = libxl_domain_fork_vm(ctx, domid_in, max_vcpus, &domid_out); + } + + if 
(rc == EXIT_SUCCESS) { + if ( launch_dm ) { + struct domain_create dom_info; + memset(&dom_info, 0, sizeof(dom_info)); + dom_info.ddomid = domid_out; + dom_info.dm_restore_file = dm_restore_file; + dom_info.debug = debug; + dom_info.paused = pause; + dom_info.config_file = config_file; + dom_info.migrate_fd = -1; + dom_info.send_back_fd = -1; + rc = create_domain(&dom_info) < 0 ? EXIT_FAILURE : EXIT_SUCCESS; + } else if ( !pause ) + rc = libxl_domain_unpause(ctx, domid_out, NULL); + } + + if (rc == EXIT_SUCCESS) + fprintf(stderr, "fork-vm command successfully returned domid: %u\n", domid_out); + else if ( domid_out != INVALID_DOMID ) + libxl_domain_destroy(ctx, domid_out, 0); + + return rc; +} + +/* + * Local variables: + * mode: C + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c index 2e2d427492..782fbbc24b 100644 --- a/tools/xl/xl_vmcontrol.c +++ b/tools/xl/xl_vmcontrol.c @@ -676,6 +676,12 @@ int create_domain(struct domain_create *dom_info) int restoring = (restore_file || (migrate_fd >= 0)); +#if defined(__i386__) || defined(__x86_64__) + /* VM forking */ + uint32_t ddomid = dom_info->ddomid; // launch dm for this domain iff set + const char *dm_restore_file = dom_info->dm_restore_file; +#endif + libxl_domain_config_init(&d_config); if (restoring) { @@ -926,6 +932,14 @@ start: * restore/migrate-receive it again. */ restoring = 0; +#if defined(__i386__) || defined(__x86_64__) + } else if ( ddomid ) { + d_config.dm_restore_file = dm_restore_file; + ret = libxl_domain_fork_launch_dm(ctx, &d_config, ddomid, + autoconnect_console_how); + domid = ddomid; + ddomid = INVALID_DOMID; +#endif } else if (domid_soft_reset != INVALID_DOMID) { /* Do soft reset. */ ret = libxl_domain_soft_reset(ctx, &d_config, domid_soft_reset,