From patchwork Tue Dec 23 00:39:56 2014
X-Patchwork-Submitter: Andy Lutomirski
X-Patchwork-Id: 5529341
From: Andy Lutomirski
To: Paolo Bonzini, Marcelo Tosatti
Cc: Gleb Natapov, kvm list, "linux-kernel@vger.kernel.org",
	"xen-devel@lists.xenproject.org", Andy Lutomirski
Subject: [RFC 1/2] x86, vdso: Use asm volatile in __getcpu
Date: Mon, 22 Dec 2014 16:39:56 -0800
Message-Id: <6a3798f5f28095773dd71cf98fcee2b8edb77f2d.1419295081.git.luto@amacapital.net>
X-Mailer: git-send-email 2.1.0

In Linux 3.18 and below, GCC hoists the lsl instructions in the pvclock
code all the way to the beginning of __vdso_clock_gettime, slowing the
non-paravirt case significantly. For unknown reasons, presumably
related to the removal of a branch, the performance issue is gone as of

    e76b027e6408 x86,vdso: Use LSL unconditionally for vgetcpu

but I don't trust GCC enough to expect the problem to stay fixed.
There should be no correctness issue, because the __getcpu calls in
__vdso_clock_gettime were never necessary in the first place.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/vgtod.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index e7e9682a33e9..f556c4843aa1 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -80,9 +80,11 @@ static inline unsigned int __getcpu(void)
 
 	/*
 	 * Load per CPU data from GDT. LSL is faster than RDTSCP and
-	 * works on all CPUs.
+	 * works on all CPUs. This is volatile so that it orders
+	 * correctly wrt barrier() and to keep gcc from cleverly
+	 * hoisting it out of the calling function.
 	 */
-	asm("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
+	asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
 
 	return p;
 }
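
For reference, a minimal standalone x86-64 sketch (not part of the patch)
of the hoisting behavior described above, assuming gcc -O2. The helper
names and the 0x33 selector placeholder standing in for __PER_CPU_SEG are
made up for illustration; build with "gcc -O2 -S hoist_sketch.c" and
compare where the lsl lands relative to the branch in each caller.

/* hoist_sketch.c: illustrative sketch only, not kernel code.
 *
 * getcpu_plain() mirrors the old __getcpu(): without volatile, GCC may
 * treat the asm as a pure function of its inputs and evaluate the lsl
 * unconditionally, e.g. hoist it above the branch in its caller.
 * getcpu_volatile() mirrors the patched version: the lsl stays on the
 * path that actually reaches it.
 */

static inline unsigned int getcpu_plain(void)
{
	unsigned int p;

	/* No volatile: GCC is free to CSE or hoist this asm. */
	asm("lsl %1,%0" : "=r" (p) : "r" (0x33));
	return p;
}

static inline unsigned int getcpu_volatile(void)
{
	unsigned int p;

	/* volatile: GCC must not speculate or hoist this asm. */
	asm volatile("lsl %1,%0" : "=r" (p) : "r" (0x33));
	return p;
}

unsigned int maybe_getcpu_plain(int want_cpu)
{
	/* The lsl from getcpu_plain() may migrate above this test. */
	return want_cpu ? getcpu_plain() : 0;
}

unsigned int maybe_getcpu_volatile(int want_cpu)
{
	/* Here the lsl can only execute when want_cpu is nonzero. */
	return want_cpu ? getcpu_volatile() : 0;
}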