From patchwork Thu Nov 13 15:28:46 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sedat Dilek X-Patchwork-Id: 5298811 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 70CEFC11AC for ; Thu, 13 Nov 2014 15:28:59 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id B4E69201C8 for ; Thu, 13 Nov 2014 15:28:57 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id DF875200F0 for ; Thu, 13 Nov 2014 15:28:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7F3036ED02; Thu, 13 Nov 2014 07:28:52 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-wg0-f43.google.com (mail-wg0-f43.google.com [74.125.82.43]) by gabe.freedesktop.org (Postfix) with ESMTP id 5F4BA6ED00 for ; Thu, 13 Nov 2014 07:28:47 -0800 (PST) Received: by mail-wg0-f43.google.com with SMTP id y10so17406426wgg.2 for ; Thu, 13 Nov 2014 07:28:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:content-type; bh=/Wh+erIo+SPvivgmlMRlTBTUX3q8WI8HpZsRzKXNrGk=; b=hlhnTzS5NcVUxrZi7fAM3WEaiXgdDA5UisxJH+8BHzknjrs9FNqeNcc2s4+rETm0hM 62FU6c9A70KWp6693eZaZI4+uog/ysLywebFK9+n0gTBTYj2yS/q6jxCpUyliIYzugsF Aprya1lf53h0IVehWJ01CetrVSCC5g47/2RLdATo02gw5STZAgU/ita7xHEMPoaDk7Bv YkFMs9Pgbpo2Mjx5MjFjfluoWXlr5mbaNRtwqevBsoieA1tAKOeiFSKKu2n4pIXrW73V j2UOWYFoR42WDUsTreE85BlEoe55gVjJ9J0KkeJuFSxgg03w5iR8o/hwWQpAnnecmaMR z7tA== MIME-Version: 1.0 X-Received: by 10.194.58.8 with SMTP id m8mr5120764wjq.43.1415892526288; Thu, 13 Nov 2014 07:28:46 -0800 (PST) Received: by 10.217.89.195 with HTTP; Thu, 13 Nov 2014 07:28:46 -0800 (PST) In-Reply-To: References: <20141107102956.GP3250@nuc-i3427.alporthouse.com> <20141107104350.GR3250@nuc-i3427.alporthouse.com> <20141107111452.GS3250@nuc-i3427.alporthouse.com> <20141107121011.GT3250@nuc-i3427.alporthouse.com> Date: Thu, 13 Nov 2014 16:28:46 +0100 Message-ID: From: Sedat Dilek To: Chris Wilson , intel-gfx Subject: Re: [Intel-gfx] sna: Experimental support for write-combining mmaps (wc-mmap) X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: sedat.dilek@gmail.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Hi, what is the status of drm-intel-wc-mmap patchset (#2 + #3)? I have refreshed them on top of drm-intel-coherent-phys-gtt patch (#1). Playing with that againt Linux v3.18-rc4. Regards, - Sedat - From 4882abcb71b4982371a1aad038d7565887138ee5 Mon Sep 17 00:00:00 2001 From: Akash Goel Date: Thu, 23 Oct 2014 17:55:47 +0100 Subject: [PATCH 3/3] drm/i915: Support creation of unbound wc user mappings for objects This patch provides support to create write-combining virtual mappings of GEM object. It intends to provide the same funtionality of 'mmap_gtt' interface without the constraints and contention of a limited aperture space, but requires clients handles the linear to tile conversion on their own. This is for improving the CPU write operation performance, as with such mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache flush after update from CPU side, when object is passed onto GPU. This type of mapping is specially useful in case of sub-region update, i.e. when only a portion of the object is to be updated. Using a CPU mmap in such cases would normally incur a clflush of the whole object, and using a GTT mmapping would likely require eviction of an active object or fence and thus stall. The write-combining CPU mmap avoids both. To ensure the cache coherency, before using this mapping, the GTT domain has been reused here. This provides the required cache flush if the object is in CPU domain or synchronization against the concurrent rendering. Although the access through an uncached mmap should automatically invalidate the cache lines, this may not be true for non-temporal write instructions and also not all pages of the object may be updated at any given point of time through this mapping. Having a call to get_pages in set_to_gtt_domain function, as added in the earlier patch 'drm/i915: Broaden application of set-domain(GTT)', would guarantee the clflush and so there will be no cachelines holding the data for the object before it is accessed through this map. The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been extended with a new flags field (defaulting to 0 for existent users). In order for userspace to detect the extended ioctl, a new parameter I915_PARAM_HAS_EXT_MMAP has been added for versioning the ioctl interface. v2: Fix error handling, invalid flag detection, renaming (ickle) v3: Adapt to fit "drm/i915: Make the physical object coherent with GTT" (dileks) The new mmapping is exercised by igt/gem_mmap_wc, igt/gem_concurrent_blit and igt/gem_gtt_speed. Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a Signed-off-by: Akash Goel Signed-off-by: Chris Wilson Cc: Daniel Vetter --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++++++++++++ include/uapi/drm/i915_drm.h | 9 +++++++++ 3 files changed, 31 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 9b08853..c034cfc 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1030,6 +1030,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_COHERENT_PHYS_GTT: value = 1; break; + case I915_PARAM_MMAP_VERSION: + value = 1; + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index fe119e1..48d0bbc 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1548,6 +1548,12 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, struct drm_gem_object *obj; unsigned long addr; + if (args->flags & ~(I915_MMAP_WC)) + return -EINVAL; + + if (args->flags & I915_MMAP_WC && !cpu_has_pat) + return -ENODEV; + obj = drm_gem_object_lookup(dev, file, args->handle); if (obj == NULL) return -ENOENT; @@ -1563,6 +1569,19 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data, addr = vm_mmap(obj->filp, 0, args->size, PROT_READ | PROT_WRITE, MAP_SHARED, args->offset); + if (args->flags & I915_MMAP_WC) { + struct mm_struct *mm = current->mm; + struct vm_area_struct *vma; + + down_write(&mm->mmap_sem); + vma = find_vma(mm, addr); + if (vma) + vma->vm_page_prot = + pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); + else + addr = -ENOMEM; + up_write(&mm->mmap_sem); + } drm_gem_object_unreference_unlocked(obj); if (IS_ERR((void *)addr)) return addr; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index c6b229f..f7eaefa 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -341,6 +341,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_WT 27 #define I915_PARAM_CMD_PARSER_VERSION 28 #define I915_PARAM_HAS_COHERENT_PHYS_GTT 29 +#define I915_PARAM_MMAP_VERSION 30 typedef struct drm_i915_getparam { int param; @@ -488,6 +489,14 @@ struct drm_i915_gem_mmap { * This is a fixed-size type for 32/64 compatibility. */ __u64 addr_ptr; + + /** + * Flags for extended behaviour. + * + * Added in version 2. + */ + __u64 flags; +#define I915_MMAP_WC 0x1 }; struct drm_i915_gem_mmap_gtt { -- 2.1.3