From patchwork Wed Apr 26 05:06:19 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexey Perevalov
X-Patchwork-Id: 9700285
From: Alexey Perevalov
To: dgilbert@redhat.com
Date: Wed, 26 Apr 2017 08:06:19 +0300
Message-id: <1493183181-21962-6-git-send-email-a.perevalov@samsung.com>
X-Mailer: git-send-email 1.9.1
In-reply-to: <1493183181-21962-1-git-send-email-a.perevalov@samsung.com>
References: <1493183181-21962-1-git-send-email-a.perevalov@samsung.com>
Subject: [Qemu-devel] [PATCH V3 5/6] migration: calculate downtime on dst side
Cc: i.maximets@samsung.com, qemu-devel@nongnu.org, a.perevalov@samsung.com,
 peterx@redhat.com, f4bug@amsat.org

This patch provides downtime calculation per vCPU, as a summary and as
an overlapped value for all vCPUs.

This approach was suggested by Peter Xu as an improvement on the
previous approach, where QEMU kept a tree with the faulted page address
and a CPU bitmask in it. Now QEMU keeps an array with the faulted page
address as the value and the vCPU as the index. This helps to find the
proper vCPU at UFFD_COPY time. It also keeps a list of downtime per vCPU
(which can be traced with page_fault_addr).

For more details, see the comments on the get_postcopy_total_downtime
implementation.

Downtime will not be calculated if the postcopy_downtime field of
MigrationIncomingState wasn't initialized.

Signed-off-by: Alexey Perevalov
---
 include/migration/migration.h |   3 ++
 migration/migration.c         | 103 ++++++++++++++++++++++++++++++++++++++++++
 migration/postcopy-ram.c      |  20 +++++++-
 migration/trace-events        |   6 ++-
 4 files changed, 130 insertions(+), 2 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index b1759f7..137405b 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -139,6 +139,9 @@ void migration_incoming_state_destroy(void);
  * Functions to work with downtime context
  */
 struct DowntimeContext *downtime_context_new(void);
+void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
+void mark_postcopy_downtime_end(uint64_t addr);
+uint64_t get_postcopy_total_downtime(void);

 /*
  * An outstanding page request, on the source, having been received
diff --git a/migration/migration.c b/migration/migration.c
index 0309c2b..b559dfe 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2150,3 +2150,106 @@ PostcopyState postcopy_state_set(PostcopyState new_state)
     return atomic_xchg(&incoming_postcopy_state, new_state);
 }

+void mark_postcopy_downtime_begin(uint64_t addr, int cpu)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeContext *dc;
+    if (!mis->downtime_ctx || cpu < 0) {
+        return;
+    }
+    dc = mis->downtime_ctx;
+    dc->vcpu_addr[cpu] = addr;
+    dc->last_begin = dc->page_fault_vcpu_time[cpu] =
+        qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    trace_mark_postcopy_downtime_begin(addr, dc, dc->page_fault_vcpu_time[cpu],
+                                       cpu);
+}
+
+void mark_postcopy_downtime_end(uint64_t addr)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    DowntimeContext *dc;
+    int i;
+    bool all_vcpu_down = true;
+    int64_t now;
+
+    if (!mis->downtime_ctx) {
+        return;
+    }
+    dc = mis->downtime_ctx;
+    now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+    /* Check whether all vCPUs are down.
+     * QEMU has bitmap.h, but even bitmap_and
+     * would still need a loop here. */
+    for (i = 0; i < smp_cpus; i++) {
+        if (dc->vcpu_addr[i]) {
+            continue;
+        }
+        all_vcpu_down = false;
+        break;
+    }
+
+    if (all_vcpu_down) {
+        dc->total_downtime += now - dc->last_begin;
+    }
+
+    /* look up the vCPU waiting on this address, to clear it */
+    for (i = 0; i < smp_cpus; i++) {
+        uint64_t vcpu_downtime;
+
+        if (dc->vcpu_addr[i] != addr) {
+            continue;
+        }
+
+        vcpu_downtime = now - dc->page_fault_vcpu_time[i];
+
+        dc->vcpu_addr[i] = 0;
+        dc->vcpu_downtime[i] += vcpu_downtime;
+    }
+
+    trace_mark_postcopy_downtime_end(addr, dc, dc->total_downtime);
+}
+
+/*
+ * This function just returns the precomputed downtime and traces it per vCPU.
+ * Total downtime is calculated in mark_postcopy_downtime_end.
+ *
+ *
+ * Assume we have 3 CPUs
+ *
+ *      S1        E1           S1               E1
+ * -----***********------------xxx***************------------------------> CPU1
+ *
+ *             S2                E2
+ * ------------****************xxx---------------------------------------> CPU2
+ *
+ *                         S3            E3
+ * ------------------------****xxx********-------------------------------> CPU3
+ *
+ * We have sequence S1,S2,E1,S3,S1,E2,E3,E1
+ * S2,E1 - doesn't match the condition, because S1,S2,E1 doesn't include CPU3
+ * S3,S1,E2 - sequence includes all CPUs, in this case overlap will be S1,E2 -
+ *            it's a part of total downtime.
+ * S1 - here is last_begin
+ * The legend of the picture is as follows:
+ *              * - means downtime per vCPU
+ *              x - means overlapped downtime (total downtime)
+ */
+uint64_t get_postcopy_total_downtime(void)
+{
+    MigrationIncomingState *mis = migration_incoming_get_current();
+
+    if (!mis->downtime_ctx) {
+        return 0;
+    }
+
+    if (trace_event_get_state(TRACE_DOWNTIME_PER_CPU)) {
+        int i;
+        for (i = 0; i < smp_cpus; i++) {
+            trace_downtime_per_cpu(i, mis->downtime_ctx->vcpu_downtime[i]);
+        }
+    }
+    return mis->downtime_ctx->total_downtime;
+}
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index ce1ea5d..03c2be7 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -23,6 +23,7 @@
 #include "migration/postcopy-ram.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/balloon.h"
+#include
 #include "qemu/error-report.h"
 #include "trace.h"

@@ -470,6 +471,19 @@ static int ram_block_enable_notify(const char *block_name, void *host_addr,
     return 0;
 }

+static int get_mem_fault_cpu_index(uint32_t pid)
+{
+    CPUState *cpu_iter;
+
+    CPU_FOREACH(cpu_iter) {
+        if (cpu_iter->thread_id == pid) {
+            return cpu_iter->cpu_index;
+        }
+    }
+    trace_get_mem_fault_cpu_index(pid);
+    return -1;
+}
+
 /*
  * Handle faults detected by the USERFAULT markings
  */
@@ -547,8 +561,11 @@ static void *postcopy_ram_fault_thread(void *opaque)
         rb_offset &= ~(qemu_ram_pagesize(rb) - 1);
         trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
                                                 qemu_ram_get_idstr(rb),
-                                                rb_offset);
+                                                rb_offset,
+                                                msg.arg.pagefault.feat.ptid);

+        mark_postcopy_downtime_begin((uintptr_t)(msg.arg.pagefault.address),
+                         get_mem_fault_cpu_index(msg.arg.pagefault.feat.ptid));
         /*
          * Send the request to the source - we want to request one
          * of our host page sizes (which is >= TPS)
@@ -643,6 +660,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
         return -e;
     }

+    mark_postcopy_downtime_end((uintptr_t)host);
     trace_postcopy_place_page(host);
     return 0;
diff --git a/migration/trace-events b/migration/trace-events
index 7372ce2..19e7dc5 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -110,6 +110,9 @@ process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s"
 migration_set_outgoing_channel(void *ioc, const char *ioctype, const char *hostname) "ioc=%p ioctype=%s hostname=%s"
+mark_postcopy_downtime_begin(uint64_t addr, void *dd, int64_t time, int cpu) "addr 0x%" PRIx64 " dd %p time %" PRId64 " cpu %d"
+mark_postcopy_downtime_end(uint64_t addr, void *dd, int64_t time) "addr 0x%" PRIx64 " dd %p time %" PRId64
+downtime_per_cpu(int cpu_index, int64_t downtime) "downtime cpu[%d]=%" PRId64

 # migration/rdma.c
 qemu_rdma_accept_incoming_migration(void) ""
@@ -186,7 +189,7 @@ postcopy_ram_enable_notify(void) ""
 postcopy_ram_fault_thread_entry(void) ""
 postcopy_ram_fault_thread_exit(void) ""
 postcopy_ram_fault_thread_quit(void) ""
-postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset) "Request for HVA=%" PRIx64 " rb=%s offset=%zx"
+postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock, size_t offset, uint32_t pid) "Request for HVA=%" PRIx64 " rb=%s offset=%zx %u"
 postcopy_ram_incoming_cleanup_closeuf(void) ""
 postcopy_ram_incoming_cleanup_entry(void) ""
 postcopy_ram_incoming_cleanup_exit(void) ""
@@ -195,6 +198,7 @@ save_xbzrle_page_skipping(void) ""
 save_xbzrle_page_overflow(void) ""
 ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" PRIu64 " milliseconds, %d iterations"
 ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRIu64
+get_mem_fault_cpu_index(uint32_t pid) "pid %u is not vCPU"

 # migration/exec.c
 migration_exec_outgoing(const char *cmd) "cmd=%s"
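
For reviewers who want to check the accounting by hand, here is a minimal standalone sketch of the begin/end bookkeeping, replaying the S1,S2,E1,S3,S1,E2,E3,E1 sequence from the comment above get_postcopy_total_downtime. It is not part of the patch. DowntimeSketch, sketch_begin, sketch_end and sketch_replay_example are hypothetical names, the timestamps are made-up values picked to follow the order of events in the picture, and the clock is passed in as a parameter instead of being read with qemu_clock_get_ms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_VCPUS 3   /* assumption: three vCPUs, as in the picture */

/* Hypothetical stand-in for the fields the patch uses in DowntimeContext. */
typedef struct {
    uint64_t vcpu_addr[NR_VCPUS];     /* faulted address per vCPU, 0 = running */
    int64_t  fault_time[NR_VCPUS];    /* start of the vCPU's current fault */
    int64_t  vcpu_downtime[NR_VCPUS]; /* accumulated downtime per vCPU */
    int64_t  last_begin;              /* time of the most recent fault start */
    int64_t  total_downtime;          /* time during which all vCPUs were down */
} DowntimeSketch;

/* Mirrors mark_postcopy_downtime_begin, with the clock passed in as 'now'. */
static void sketch_begin(DowntimeSketch *dc, uint64_t addr, int cpu, int64_t now)
{
    if (cpu < 0 || cpu >= NR_VCPUS) {
        return;
    }
    dc->vcpu_addr[cpu] = addr;
    dc->last_begin = dc->fault_time[cpu] = now;
}

/* Mirrors mark_postcopy_downtime_end, with the clock passed in as 'now'. */
static void sketch_end(DowntimeSketch *dc, uint64_t addr, int64_t now)
{
    int all_down = 1;
    int i;

    /* Total downtime only grows if every vCPU is waiting on a page. */
    for (i = 0; i < NR_VCPUS; i++) {
        if (!dc->vcpu_addr[i]) {
            all_down = 0;
            break;
        }
    }
    if (all_down) {
        dc->total_downtime += now - dc->last_begin;
    }

    /* Clear every vCPU that was waiting on this address. */
    for (i = 0; i < NR_VCPUS; i++) {
        if (dc->vcpu_addr[i] != addr) {
            continue;
        }
        dc->vcpu_downtime[i] += now - dc->fault_time[i];
        dc->vcpu_addr[i] = 0;
    }
}

/* Replays S1,S2,E1,S3,S1,E2,E3,E1 with illustrative timestamps. */
DowntimeSketch sketch_replay_example(void)
{
    DowntimeSketch dc;

    memset(&dc, 0, sizeof(dc));
    sketch_begin(&dc, 0x1000, 0, 5);   /* S1 */
    sketch_begin(&dc, 0x2000, 1, 12);  /* S2 */
    sketch_end(&dc, 0x1000, 16);       /* E1: CPU3 still running, no overlap */
    sketch_begin(&dc, 0x3000, 2, 24);  /* S3 */
    sketch_begin(&dc, 0x4000, 0, 28);  /* S1: last_begin, all vCPUs down now */
    sketch_end(&dc, 0x2000, 31);       /* E2: overlap S1..E2 is counted */
    sketch_end(&dc, 0x3000, 39);       /* E3 */
    sketch_end(&dc, 0x4000, 46);       /* E1 */
    return dc;
}
```

With these timestamps, sketch_replay_example() returns total_downtime == 3 (the S1..E2 overlap) and per-vCPU downtime of 29, 19 and 15. Only the E2 event sees all three vcpu_addr slots non-zero, which matches the "S3,S1,E2" case described in the comment.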