From patchwork Sun Oct 18 12:28:11 2015
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 7430021
Date: Sun, 18 Oct 2015 13:28:11 +0100
From: Chris Wilson
To: Imre Deak
Message-ID: <20151018122811.GC27143@nuc-i3427.alporthouse.com>
References: <1445025355-19348-1-git-send-email-chris@chris-wilson.co.uk> <1445112199.28898.6.camel@intel.com>
In-Reply-To: <1445112199.28898.6.camel@intel.com>
Cc: Daniel Vetter, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm: Explicitly compute the last cacheline for clflush on range

On Sat, Oct 17, 2015 at 11:03:19PM +0300, Imre Deak wrote:
> On Fri, 2015-10-16 at 20:55 +0100, Chris Wilson wrote:
> > Fixes regression from
> >
> > commit afcd950cafea6e27b739fe7772cbbeed37d05b8b
> > Author: Chris Wilson
> > Date:   Wed Jun 10 15:58:01 2015 +0100
> >
> >     drm: Avoid the double clflush on the last cache line in drm_clflush_virt_range()
> >
> > I'm stumped. Looking at the loop we should be iterating over every cache
> > line until we reach the start of the cacheline after the end of the
> > virtual range. Evidence says otherwise.
> >
> > More bizarrely, I stored the last address to be clflushed and found it to
> > be equal to the start of the cacheline containing the last byte. Doubly
> > perplexed.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92501
> > Testcase: gem_tiled_partial_pwrite_pread/reads
> > Signed-off-by: Chris Wilson
> > Cc: Imre Deak
> > Cc: Daniel Vetter
> > ---
> >  drivers/gpu/drm/drm_cache.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> > index 6743ff7dccfa..7c909bc8b68a 100644
> > --- a/drivers/gpu/drm/drm_cache.c
> > +++ b/drivers/gpu/drm/drm_cache.c
> > @@ -131,10 +131,13 @@ drm_clflush_virt_range(void *addr, unsigned long length)
> >  #if defined(CONFIG_X86)
> >  	if (cpu_has_clflush) {
> >  		const int size = boot_cpu_data.x86_clflush_size;
> > -		void *end = addr + length;
> > -		addr = (void *)(((unsigned long)addr) & -size);
> > +		void *end;
> > +
> > +		end = (void *)(((unsigned long)addr + length - 1) & -size);
> > +		addr = (void *)((unsigned long)addr & -size);
> > +
> >  		mb();
> > -		for (; addr < end; addr += size)
> > +		for (; addr <= end; addr += size)
>
> Hm, I can't see how this could make any difference. The old way still
> looks ok to me and the new version would flush the exact same cache
> lines as the old one using the same addresses (beginning of each cache
> line).

I couldn't spot the difference either. I am beginning to suspect it is
gcc, as the patch below also fixes gem_tiled_partial_pwrite (on byt and
bsw).
-Chris

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 6743ff7..c9097b5 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -130,11 +130,11 @@ drm_clflush_virt_range(void *addr, unsigned long length)
 {
 #if defined(CONFIG_X86)
 	if (cpu_has_clflush) {
 		const int size = boot_cpu_data.x86_clflush_size;
-		void *end = addr + length;
+		void *end = addr + length - 1;
 		addr = (void *)(((unsigned long)addr) & -size);
 		mb();
-		for (; addr < end; addr += size)
+		for (; addr <= end; addr += size)
 			clflushopt(addr);
 		mb();
 		return;
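
For anyone who wants to check the arithmetic behind Imre's observation, here is a rough userspace sketch, not kernel code. It only counts which cache-line starts each loop form would visit; it issues no clflush and says nothing about what the compiler or the hardware actually does. The function names, the base address 0x1000 and the swept ranges are all made up for illustration, and the line size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Original loop form: end is one past the last byte, compared with '<'. */
static unsigned long lines_exclusive_end(uintptr_t addr, unsigned long length,
					 uintptr_t size)
{
	uintptr_t end = addr + length;
	unsigned long n = 0;

	assert(size && (size & (size - 1)) == 0); /* power-of-two line size */
	addr &= -size; /* round down to the start of the first cache line */
	for (; addr < end; addr += size)
		n++;
	return n;
}

/* Patched loop form: end is the last byte itself, compared with '<='. */
static unsigned long lines_inclusive_end(uintptr_t addr, unsigned long length,
					 uintptr_t size)
{
	uintptr_t end = addr + length - 1;
	unsigned long n = 0;

	assert(size && (size & (size - 1)) == 0);
	addr &= -size;
	for (; addr <= end; addr += size)
		n++;
	return n;
}

/*
 * Sweep start offsets and lengths around a hypothetical base address and
 * count how often the two loop forms visit a different number of lines.
 * Both start from the same aligned address, so equal counts imply the
 * same set of line addresses.
 */
static unsigned long count_mismatches(uintptr_t size)
{
	const uintptr_t base = 0x1000; /* arbitrary aligned address (assumption) */
	unsigned long mismatches = 0;

	for (uintptr_t off = 0; off < 2 * size; off++)
		for (unsigned long len = 1; len <= 4 * size; len++)
			if (lines_exclusive_end(base + off, len, size) !=
			    lines_inclusive_end(base + off, len, size))
				mismatches++;
	return mismatches;
}
```

Calling count_mismatches(64) from a small main() should report no disagreement for nonzero lengths, which matches the reading that the two forms are equivalent on paper and that the observed difference has to come from somewhere else.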