From patchwork Thu Aug 17 17:01:33 2017
X-Patchwork-Submitter: Olaf Hering
X-Patchwork-Id: 9906697
From: Olaf Hering
To: Andrew Cooper, Ian Jackson, Wei Liu, xen-devel@lists.xen.org
Date: Thu, 17 Aug 2017 19:01:33 +0200
Message-Id: <20170817170133.30939-4-olaf@aepfle.de>
X-Mailer: git-send-email 2.14.0
In-Reply-To: <20170817170133.30939-1-olaf@aepfle.de>
References: <20170817170133.30939-1-olaf@aepfle.de>
Cc: Olaf Hering
Subject: [Xen-devel] [PATCH v2 3/3] tools/libxc: use superpages during restore of HVM guest

During creation of an HVM domU, meminit_hvm() tries to map superpages.
After save/restore or migration this mapping is lost and everything is
allocated in single pages, which causes a performance degradation after
migration.

Add the necessary code to preallocate a superpage for each chunk of pfns
that is received. In case a pfn was not populated on the sending side,
it must be freed on the receiving side to avoid over-allocation.

The existing code for x86_pv is moved unmodified into its own file.

Signed-off-by: Olaf Hering
---
 tools/libxc/xc_sr_common.h          |  15 +++
 tools/libxc/xc_sr_restore.c         |  70 +-------------
 tools/libxc/xc_sr_restore_x86_hvm.c | 180 ++++++++++++++++++++++++++++++++++++
 tools/libxc/xc_sr_restore_x86_pv.c  |  72 ++++++++++++++-
 4 files changed, 267 insertions(+), 70 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 5d78f461af..26c45fdd6d 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -139,6 +139,16 @@ struct xc_sr_restore_ops
      */
     int (*setup)(struct xc_sr_context *ctx);
 
+    /**
+     * Populate PFNs
+     *
+     * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+     * unpopulated subset.
+     */
+    int (*populate_pfns)(struct xc_sr_context *ctx, unsigned count,
+                         const xen_pfn_t *original_pfns, const uint32_t *types);
+
+
     /**
      * Process an individual record from the stream. The caller shall take
      * care of processing common records (e.g. END, PAGE_DATA).
@@ -336,6 +346,11 @@ struct xc_sr_context
                     /* HVM context blob. */
                     void *context;
                     size_t contextsz;
+
+                    /* Bitmap of currently allocated PFNs during restore. */
+                    struct xc_sr_bitmap attempted_1g;
+                    struct xc_sr_bitmap attempted_2m;
+                    struct xc_sr_bitmap allocated_pfns;
                 } restore;
             };
         } x86_hvm;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index d53948e1a6..1f9fe25b8f 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -68,74 +68,6 @@ static int read_headers(struct xc_sr_context *ctx)
     return 0;
 }
 
-/*
- * Given a set of pfns, obtain memory from Xen to fill the physmap for the
- * unpopulated subset. If types is NULL, no page type checking is performed
- * and all unpopulated pfns are populated.
- */
-int populate_pfns(struct xc_sr_context *ctx, unsigned count,
-                  const xen_pfn_t *original_pfns, const uint32_t *types)
-{
-    xc_interface *xch = ctx->xch;
-    xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
-        *pfns = malloc(count * sizeof(*pfns));
-    unsigned i, nr_pfns = 0;
-    int rc = -1;
-
-    if ( !mfns || !pfns )
-    {
-        ERROR("Failed to allocate %zu bytes for populating the physmap",
-              2 * count * sizeof(*mfns));
-        goto err;
-    }
-
-    for ( i = 0; i < count; ++i )
-    {
-        if ( (!types || (types &&
-                         (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
-                          types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
-             !pfn_is_populated(ctx, original_pfns[i]) )
-        {
-            rc = pfn_set_populated(ctx, original_pfns[i]);
-            if ( rc )
-                goto err;
-            pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
-            ++nr_pfns;
-        }
-    }
-
-    if ( nr_pfns )
-    {
-        rc = xc_domain_populate_physmap_exact(
-            xch, ctx->domid, nr_pfns, 0, 0, mfns);
-        if ( rc )
-        {
-            PERROR("Failed to populate physmap");
-            goto err;
-        }
-
-        for ( i = 0; i < nr_pfns; ++i )
-        {
-            if ( mfns[i] == INVALID_MFN )
-            {
-                ERROR("Populate physmap failed for pfn %u", i);
-                rc = -1;
-                goto err;
-            }
-
-            ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
-        }
-    }
-
-    rc = 0;
-
- err:
-    free(pfns);
-    free(mfns);
-
-    return rc;
-}
-
 /*
  * Given a list of pfns, their types, and a block of page data from the
  * stream, populate and record their types, map the relevant subset and copy
@@ -161,7 +93,7 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned count,
         goto err;
     }
 
-    rc = populate_pfns(ctx, count, pfns, types);
+    rc = ctx->restore.ops.populate_pfns(ctx, count, pfns, types);
     if ( rc )
     {
         ERROR("Failed to populate pfns for batch of %u pages", count);
diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c b/tools/libxc/xc_sr_restore_x86_hvm.c
index 1dca85354a..60454148db 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -135,6 +135,8 @@ static int x86_hvm_localise_page(struct xc_sr_context *ctx,
 static int x86_hvm_setup(struct xc_sr_context *ctx)
 {
     xc_interface *xch = ctx->xch;
+    struct xc_sr_bitmap *bm;
+    unsigned long bits;
 
     if ( ctx->restore.guest_type != DHDR_TYPE_X86_HVM )
     {
@@ -149,7 +151,30 @@ static int x86_hvm_setup(struct xc_sr_context *ctx)
         return -1;
     }
 
+    bm = &ctx->x86_hvm.restore.attempted_1g;
+    bits = (ctx->restore.p2m_size >> SUPERPAGE_1GB_SHIFT) + 1;
+    if ( xc_sr_bitmap_resize(bm, bits) == false )
+        goto out;
+
+    bm = &ctx->x86_hvm.restore.attempted_2m;
+    bits = (ctx->restore.p2m_size >> SUPERPAGE_2MB_SHIFT) + 1;
+    if ( xc_sr_bitmap_resize(bm, bits) == false )
+        goto out;
+
+    bm = &ctx->x86_hvm.restore.allocated_pfns;
+    bits = ctx->restore.p2m_size + 1;
+    if ( xc_sr_bitmap_resize(bm, bits) == false )
+        goto out;
+
+    /* No superpage in 1st 2MB due to VGA hole */
+    xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_1g);
+    xc_sr_set_bit(0, &ctx->x86_hvm.restore.attempted_2m);
+
     return 0;
+
+out:
+    ERROR("Unable to allocate memory for pfn bitmaps");
+    return -1;
 }
 
 /*
@@ -224,10 +249,164 @@ static int x86_hvm_stream_complete(struct xc_sr_context *ctx)
 static int x86_hvm_cleanup(struct xc_sr_context *ctx)
 {
     free(ctx->x86_hvm.restore.context);
+    xc_sr_bitmap_free(&ctx->x86_hvm.restore.attempted_1g);
+    xc_sr_bitmap_free(&ctx->x86_hvm.restore.attempted_2m);
+    xc_sr_bitmap_free(&ctx->x86_hvm.restore.allocated_pfns);
 
     return 0;
 }
 
+/*
+ * Set a pfn as allocated, expanding the tracking structures if needed.
+ */
+static int pfn_set_allocated(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    xc_interface *xch = ctx->xch;
+
+    if ( !xc_sr_set_bit(pfn, &ctx->x86_hvm.restore.allocated_pfns) )
+    {
+        ERROR("Failed to realloc allocated_pfns bitmap");
+        errno = ENOMEM;
+        return -1;
+    }
+    return 0;
+}
+
+/*
+ * Attempt to allocate a superpage where the pfn resides.
+ */
+static int x86_hvm_allocate_pfn(struct xc_sr_context *ctx, xen_pfn_t pfn)
+{
+    xc_interface *xch = ctx->xch;
+    bool success = false;
+    int rc = -1, done;
+    unsigned int order;
+    unsigned long i;
+    unsigned long stat_1g = 0, stat_2m = 0, stat_4k = 0;
+    unsigned long idx_1g, idx_2m;
+    unsigned long count;
+    xen_pfn_t base_pfn = 0, extnt;
+
+    if (xc_sr_test_bit(pfn, &ctx->x86_hvm.restore.allocated_pfns))
+        return 0;
+
+    idx_1g = pfn >> SUPERPAGE_1GB_SHIFT;
+    idx_2m = pfn >> SUPERPAGE_2MB_SHIFT;
+    if (!xc_sr_bitmap_resize(&ctx->x86_hvm.restore.attempted_1g, idx_1g))
+    {
+        PERROR("Failed to realloc attempted_1g");
+        return -1;
+    }
+    if (!xc_sr_bitmap_resize(&ctx->x86_hvm.restore.attempted_2m, idx_2m))
+    {
+        PERROR("Failed to realloc attempted_2m");
+        return -1;
+    }
+    DPRINTF("idx_1g %lu idx_2m %lu\n", idx_1g, idx_2m);
+    if (!xc_sr_test_and_set_bit(idx_1g, &ctx->x86_hvm.restore.attempted_1g)) {
+        order = SUPERPAGE_1GB_SHIFT;
+        count = 1UL << order;
+        base_pfn = (pfn >> order) << order;
+        extnt = base_pfn;
+        done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extnt);
+        DPRINTF("1G base_pfn %" PRI_xen_pfn " done %d\n", base_pfn, done);
+        if (done > 0) {
+            struct xc_sr_bitmap *bm = &ctx->x86_hvm.restore.attempted_2m;
+            success = true;
+            stat_1g = done;
+            for (i = 0; i < (count >> SUPERPAGE_2MB_SHIFT); i++)
+                xc_sr_set_bit((base_pfn >> SUPERPAGE_2MB_SHIFT) + i, bm);
+        }
+    }
+
+    if (!xc_sr_test_and_set_bit(idx_2m, &ctx->x86_hvm.restore.attempted_2m)) {
+        order = SUPERPAGE_2MB_SHIFT;
+        count = 1UL << order;
+        base_pfn = (pfn >> order) << order;
+        extnt = base_pfn;
+        done = xc_domain_populate_physmap(xch, ctx->domid, 1, order, 0, &extnt);
+        DPRINTF("2M base_pfn %" PRI_xen_pfn " done %d\n", base_pfn, done);
+        if (done > 0) {
+            success = true;
+            stat_2m = done;
+        }
+    }
+    if (success == false) {
+        count = 1;
+        extnt = base_pfn = pfn;
+        done = xc_domain_populate_physmap(xch, ctx->domid, count, 0, 0, &extnt);
+        if (done > 0) {
+            success = true;
+            stat_4k = count;
+        }
+    }
+    DPRINTF("count %lu 1G %lu 2M %lu 4k %lu\n", count, stat_1g, stat_2m, stat_4k);
+    if (success == true) {
+        do {
+            count--;
+            rc = pfn_set_allocated(ctx, base_pfn + count);
+            if (rc)
+                break;
+        } while (count);
+    }
+    return rc;
+}
+
+static int x86_hvm_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+                                 const xen_pfn_t *original_pfns,
+                                 const uint32_t *types)
+{
+    xc_interface *xch = ctx->xch;
+    xen_pfn_t min_pfn = original_pfns[0], max_pfn = original_pfns[0];
+    unsigned i;
+    int rc = -1;
+
+    for ( i = 0; i < count; ++i )
+    {
+        if (original_pfns[i] < min_pfn)
+            min_pfn = original_pfns[i];
+        if (original_pfns[i] > max_pfn)
+            max_pfn = original_pfns[i];
+        if ( (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
+              types[i] != XEN_DOMCTL_PFINFO_BROKEN) &&
+             !pfn_is_populated(ctx, original_pfns[i]) )
+        {
+            rc = x86_hvm_allocate_pfn(ctx, original_pfns[i]);
+            if ( rc )
+                goto err;
+            rc = pfn_set_populated(ctx, original_pfns[i]);
+            if ( rc )
+                goto err;
+        }
+    }
+
+    while (min_pfn < max_pfn)
+    {
+        if (!xc_sr_bitmap_resize(&ctx->x86_hvm.restore.allocated_pfns, min_pfn))
+        {
+            PERROR("Failed to realloc allocated_pfns %" PRI_xen_pfn, min_pfn);
+            goto err;
+        }
+        if (!pfn_is_populated(ctx, min_pfn) &&
+            xc_sr_test_and_clear_bit(min_pfn, &ctx->x86_hvm.restore.allocated_pfns)) {
+            xen_pfn_t pfn = min_pfn;
+            rc = xc_domain_decrease_reservation_exact(xch, ctx->domid, 1, 0, &pfn);
+            if ( rc )
+            {
+                PERROR("Failed to release pfn %" PRI_xen_pfn, min_pfn);
+                goto err;
+            }
+        }
+        min_pfn++;
+    }
+
+    rc = 0;
+
+ err:
+    return rc;
+}
+
+
 struct xc_sr_restore_ops restore_ops_x86_hvm =
 {
     .pfn_is_valid    = x86_hvm_pfn_is_valid,
@@ -236,6 +415,7 @@ struct xc_sr_restore_ops restore_ops_x86_hvm =
     .set_page_type   = x86_hvm_set_page_type,
     .localise_page   = x86_hvm_localise_page,
     .setup           = x86_hvm_setup,
+    .populate_pfns   = x86_hvm_populate_pfns,
     .process_record  = x86_hvm_process_record,
     .stream_complete = x86_hvm_stream_complete,
     .cleanup         = x86_hvm_cleanup,
diff --git a/tools/libxc/xc_sr_restore_x86_pv.c b/tools/libxc/xc_sr_restore_x86_pv.c
index 50e25c162c..87957559bc 100644
--- a/tools/libxc/xc_sr_restore_x86_pv.c
+++ b/tools/libxc/xc_sr_restore_x86_pv.c
@@ -936,6 +936,75 @@ static void x86_pv_set_gfn(struct xc_sr_context *ctx, xen_pfn_t pfn,
         ((uint32_t *)ctx->x86_pv.p2m)[pfn] = mfn;
 }
 
+/*
+ * Given a set of pfns, obtain memory from Xen to fill the physmap for the
+ * unpopulated subset. If types is NULL, no page type checking is performed
+ * and all unpopulated pfns are populated.
+ */
+static int x86_pv_populate_pfns(struct xc_sr_context *ctx, unsigned count,
+                                const xen_pfn_t *original_pfns,
+                                const uint32_t *types)
+{
+    xc_interface *xch = ctx->xch;
+    xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
+        *pfns = malloc(count * sizeof(*pfns));
+    unsigned i, nr_pfns = 0;
+    int rc = -1;
+
+    if ( !mfns || !pfns )
+    {
+        ERROR("Failed to allocate %zu bytes for populating the physmap",
+              2 * count * sizeof(*mfns));
+        goto err;
+    }
+
+    for ( i = 0; i < count; ++i )
+    {
+        if ( (!types || (types &&
+                         (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
+                          types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
+             !pfn_is_populated(ctx, original_pfns[i]) )
+        {
+            rc = pfn_set_populated(ctx, original_pfns[i]);
+            if ( rc )
+                goto err;
+            pfns[nr_pfns] = mfns[nr_pfns] = original_pfns[i];
+            ++nr_pfns;
+        }
+    }
+
+    if ( nr_pfns )
+    {
+        rc = xc_domain_populate_physmap_exact(
+            xch, ctx->domid, nr_pfns, 0, 0, mfns);
+        if ( rc )
+        {
+            PERROR("Failed to populate physmap");
+            goto err;
+        }
+
+        for ( i = 0; i < nr_pfns; ++i )
+        {
+            if ( mfns[i] == INVALID_MFN )
+            {
+                ERROR("Populate physmap failed for pfn %u", i);
+                rc = -1;
+                goto err;
+            }
+
+            ctx->restore.ops.set_gfn(ctx, pfns[i], mfns[i]);
+        }
+    }
+
+    rc = 0;
+
+ err:
+    free(pfns);
+    free(mfns);
+
+    return rc;
+}
+
 /*
  * restore_ops function. Convert pfns back to mfns in pagetables. Possibly
  * needs to populate new frames if a PTE is found referring to a frame which
@@ -980,7 +1049,7 @@ static int x86_pv_localise_page(struct xc_sr_context *ctx,
         }
     }
 
-    if ( to_populate && populate_pfns(ctx, to_populate, pfns, NULL) )
+    if ( to_populate && x86_pv_populate_pfns(ctx, to_populate, pfns, NULL) )
         return -1;
 
     for ( i = 0; i < (PAGE_SIZE / sizeof(uint64_t)); ++i )
@@ -1160,6 +1229,7 @@ struct xc_sr_restore_ops restore_ops_x86_pv =
     .set_gfn         = x86_pv_set_gfn,
     .localise_page   = x86_pv_localise_page,
     .setup           = x86_pv_setup,
+    .populate_pfns   = x86_pv_populate_pfns,
     .process_record  = x86_pv_process_record,
     .stream_complete = x86_pv_stream_complete,
     .cleanup         = x86_pv_cleanup,
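
For reference, a minimal standalone sketch (not part of the patch) of the index
arithmetic behind the attempted_1g/attempted_2m bitmaps, assuming the usual
libxc values SUPERPAGE_2MB_SHIFT == 9 and SUPERPAGE_1GB_SHIFT == 18, i.e.
superpage sizes expressed in 4 KiB frames:

#include <stdio.h>
#include <stdint.h>

/* Assumed constants: a 2MB superpage covers 2^9 4KiB frames,
 * a 1GB superpage covers 2^18 4KiB frames. */
#define SUPERPAGE_2MB_SHIFT 9
#define SUPERPAGE_1GB_SHIFT 18

int main(void)
{
    uint64_t pfn = 0x12345;  /* example guest pfn from a PAGE_DATA batch */

    /* Bitmap indices, computed as in x86_hvm_allocate_pfn() above. */
    unsigned long idx_2m = pfn >> SUPERPAGE_2MB_SHIFT;
    unsigned long idx_1g = pfn >> SUPERPAGE_1GB_SHIFT;

    /* First pfn of the superpage that would be populated for this pfn. */
    uint64_t base_2m = (pfn >> SUPERPAGE_2MB_SHIFT) << SUPERPAGE_2MB_SHIFT;
    uint64_t base_1g = (pfn >> SUPERPAGE_1GB_SHIFT) << SUPERPAGE_1GB_SHIFT;

    /* Bit 0 of both "attempted" bitmaps is pre-set in x86_hvm_setup(), so no
     * superpage is attempted for the region containing the VGA hole. */
    printf("pfn 0x%llx: idx_2m %lu base_2m 0x%llx, idx_1g %lu base_1g 0x%llx\n",
           (unsigned long long)pfn, idx_2m, (unsigned long long)base_2m,
           idx_1g, (unsigned long long)base_1g);
    return 0;
}

With p2m_size pfns, x86_hvm_setup() sizes each "attempted" bitmap to
(p2m_size >> SHIFT) + 1 bits, one bit per superpage-sized region, and sizes
allocated_pfns to one bit per pfn.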