From patchwork Tue Dec 23 00:39:56 2014
X-Patchwork-Submitter: Andy Lutomirski
X-Patchwork-Id: 5529341
From: Andy Lutomirski
To: Paolo Bonzini, Marcelo Tosatti
Cc: Gleb Natapov, kvm list, "linux-kernel@vger.kernel.org",
	"xen-devel@lists.xenproject.org", Andy Lutomirski
Subject: [RFC 1/2] x86, vdso: Use asm volatile in __getcpu
Date: Mon, 22 Dec 2014 16:39:56 -0800
Message-Id: <6a3798f5f28095773dd71cf98fcee2b8edb77f2d.1419295081.git.luto@amacapital.net>
X-Mailer: git-send-email 2.1.0

In Linux 3.18 and below, GCC hoists the lsl instructions in the pvclock
code all the way to the beginning of __vdso_clock_gettime, slowing the
non-paravirt case significantly. For unknown reasons, presumably
related to the removal of a branch, the performance issue is gone as of

    e76b027e6408 x86,vdso: Use LSL unconditionally for vgetcpu

but I don't trust GCC enough to expect the problem to stay fixed.
There should be no correctness issue, because the __getcpu calls in
__vdso_clock_gettime were never necessary in the first place.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/vgtod.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index e7e9682a33e9..f556c4843aa1 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -80,9 +80,11 @@ static inline unsigned int __getcpu(void)
 
 	/*
 	 * Load per CPU data from GDT. LSL is faster than RDTSCP and
-	 * works on all CPUs.
+	 * works on all CPUs. This is volatile so that it orders
+	 * correctly wrt barrier() and to keep gcc from cleverly
+	 * hoisting it out of the calling function.
 	 */
-	asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
+	asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
 
 	return p;
 }
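
For reference, a minimal standalone x86-64 sketch (not part of the patch)
of the hoisting behavior described above, assuming gcc -O2. The helper
names and the 0x33 selector placeholder standing in for __PER_CPU_SEG are
made up for illustration; build with "gcc -O2 -S hoist_sketch.c" and
compare where the lsl lands relative to the branch in each caller.

/* hoist_sketch.c: illustrative sketch only, not kernel code.
 *
 * getcpu_plain() mirrors the old __getcpu(): without volatile, GCC may
 * treat the asm as a pure function of its inputs and evaluate the lsl
 * unconditionally, e.g. hoist it above the branch in its caller.
 * getcpu_volatile() mirrors the patched version: the lsl stays on the
 * path that actually reaches it.
 */

static inline unsigned int getcpu_plain(void)
{
	unsigned int p;

	/* No volatile: GCC is free to CSE or hoist this asm. */
	asm("lsl %1,%0" : "=r" (p) : "r" (0x33));
	return p;
}

static inline unsigned int getcpu_volatile(void)
{
	unsigned int p;

	/* volatile: GCC must not speculate or hoist this asm. */
	asm volatile("lsl %1,%0" : "=r" (p) : "r" (0x33));
	return p;
}

unsigned int maybe_getcpu_plain(int want_cpu)
{
	/* The lsl from getcpu_plain() may migrate above this test. */
	return want_cpu ? getcpu_plain() : 0;
}

unsigned int maybe_getcpu_volatile(int want_cpu)
{
	/* Here the lsl can only execute when want_cpu is nonzero. */
	return want_cpu ? getcpu_volatile() : 0;
}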