From patchwork Mon Oct 19 09:58:55 2015
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 7434891
From: Chris Wilson
To: linux-kernel@vger.kernel.org
Cc: Daniel Vetter, dri-devel@lists.freedesktop.org, Ross Zwisler,
 "H. Peter Anvin"
Subject: [PATCH] x86: Add an explicit barrier() to clflushopt()
Date: Mon, 19 Oct 2015 10:58:55 +0100
Message-Id: <1445248735-11915-1-git-send-email-chris@chris-wilson.co.uk>

During testing we observed that the last cacheline was not being flushed
from a

	mb()
	for (addr = addr & -clflush_size; addr < end; addr += clflush_size)
		clflushopt();
	mb()

loop (where the initial addr and end were not cacheline aligned).

Changing the loop condition from addr < end to addr <= end, or replacing
the clflushopt() with clflush(), each fixed the testcase. This hints that
GCC was miscompiling the assembly within the loop, and specifically that
the alternative within clflushopt() was confusing the loop optimizer.

Adding a barrier() into clflushopt() is enough for GCC to do the right
thing, but working out why GCC is not seeing the constraints from the
alternative_io() would be smarter...

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92501
Testcase: gem_tiled_partial_pwrite_pread/read
Signed-off-by: Chris Wilson
Cc: Ross Zwisler
Cc: H. Peter Anvin
Cc: Imre Deak
Cc: Daniel Vetter
Cc: dri-devel@lists.freedesktop.org
---
 arch/x86/include/asm/special_insns.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2270e41b32fd..0c7aedbf8930 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -199,6 +199,11 @@ static inline void clflushopt(volatile void *__p)
 		       ".byte 0x66; clflush %P0",
 		       X86_FEATURE_CLFLUSHOPT,
 		       "+m" (*(volatile char __force *)__p));
+	/* GCC (4.9.1 and 5.2.1 at least) appears to be very confused when
+	 * meeting this alternative() and demonstrably miscompiles loops
+	 * iterating over clflushopts.
+	 */
+	barrier();
 }
 
 static inline void clwb(volatile void *__p)
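
For reference, the iteration count the loop in the commit message is expected to produce can be checked in user space. The following is only a minimal sketch, not kernel code: fake_clflushopt(), flush_range(), flushed_lines and last_flushed are hypothetical names introduced for illustration, the flush itself is replaced by a counter, and barrier() is defined here as a plain GCC-style compiler barrier standing in for both barrier() and mb(). It does not reproduce the miscompilation; it only shows that, with an exclusive end, addr < end already covers the last partially used cacheline when the code is compiled as written.

```c
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax); an assumption standing in
 * for the kernel's barrier() and, in this sketch, for mb() as well. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical bookkeeping: how many lines were "flushed" and which
 * line address was visited last. */
static unsigned int flushed_lines;
static uintptr_t last_flushed;

/* Hypothetical stand-in for clflushopt(): records the cacheline
 * instead of flushing it, then issues the compiler barrier that the
 * patch adds. */
static inline void fake_clflushopt(uintptr_t line)
{
	flushed_lines++;
	last_flushed = line;
	barrier();
}

/* Same loop shape as in the commit message: round addr down to a
 * cacheline boundary, then step one line at a time up to the
 * exclusive end. clflush_size must be a power of two. */
static unsigned int flush_range(uintptr_t addr, uintptr_t end,
				uintptr_t clflush_size)
{
	flushed_lines = 0;
	barrier();
	for (addr = addr & -clflush_size; addr < end; addr += clflush_size)
		fake_clflushopt(addr);
	barrier();
	return flushed_lines;
}
```

With 64-byte lines, flush_range(0x1030, 0x10d0, 64) visits the lines at 0x1000, 0x1040, 0x1080 and 0x10c0, so the final, partially covered line is included without changing the condition to addr <= end.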