From patchwork Tue Jun 21 16:59:05 2016
X-Patchwork-Submitter: Andrew Cooper
X-Patchwork-Id: 9191009
From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper
Date: Tue, 21 Jun 2016 17:59:05 +0100
Message-ID: <1466528345-22235-4-git-send-email-andrew.cooper3@citrix.com>
In-Reply-To: <1466528345-22235-1-git-send-email-andrew.cooper3@citrix.com>
References: <1466528345-22235-1-git-send-email-andrew.cooper3@citrix.com>
Subject: [Xen-devel] [PATCH v2 4/4] x86/boot: copy/clear sections more efficiently
X-Mailer: git-send-email 2.1.4
List-Id: Xen developer discussion

Both the trampoline copy and the BSS initialisation can be performed more
efficiently by using the 4-byte variants of the string operations.

On Intel systems with ERMSB (Enhanced REP MOVSB/STOSB), this makes no
practical difference. On all other systems, each iteration moves 4 bytes
instead of 1, so this is 4 times more efficient.

Because the byte counts are now divided by 4, the linker script asserts
that both sections start and end on 4-byte boundaries; any trailing bytes
would otherwise be skipped.

Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
---
v2:
 * Alter spacing after rep prefix
---
 xen/arch/x86/boot/head.S | 9 +++++----
 xen/arch/x86/xen.lds.S   | 5 +++++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 0999997..85770e8 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -128,7 +128,8 @@ __start:
         mov     $sym_phys(__bss_end),%ecx
         sub     %edi,%ecx
         xor     %eax,%eax
-        rep     stosb
+        shr     $2,%ecx
+        rep stosl
 
         /* Interrogate CPU extended features via CPUID. */
         mov     $0x80000000,%eax
@@ -192,8 +193,8 @@ __start:
 
         /* Copy bootstrap trampoline to low memory, below 1MB. */
         mov     $sym_phys(trampoline_start),%esi
-        mov     $trampoline_end - trampoline_start,%ecx
-        rep     movsb
+        mov     $((trampoline_end - trampoline_start) / 4),%ecx
+        rep movsl
 
         /* Jump into the relocated trampoline. */
         lret
@@ -205,6 +206,6 @@ reloc:
 
 ENTRY(trampoline_start)
 #include "trampoline.S"
-GLOBAL(trampoline_end)
+ENTRY(trampoline_end)
 
 #include "x86_64.S"
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index a1678d8..d620e7a 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -314,3 +314,8 @@ ASSERT(IS_ALIGNED(cpu0_stack, STACK_SIZE), "cpu0_stack misaligned")
 
 ASSERT(IS_ALIGNED(__init_begin, PAGE_SIZE), "__init_begin misaligned")
 ASSERT(IS_ALIGNED(__init_end, PAGE_SIZE), "__init_end misaligned")
+
+ASSERT(IS_ALIGNED(trampoline_start, 4), "trampoline_start misaligned")
+ASSERT(IS_ALIGNED(trampoline_end, 4), "trampoline_end misaligned")
+ASSERT(IS_ALIGNED(__bss_start, 4), "__bss_start misaligned")
+ASSERT(IS_ALIGNED(__bss_end, 4), "__bss_end misaligned")