From patchwork Fri Jan 15 08:00:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artur Harasimiuk
X-Patchwork-Id: 8039101
Return-Path:
X-Original-To: patchwork-intel-gfx@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork1.web.kernel.org (Postfix) with ESMTP id 671EE9F1C0
	for ; Fri, 15 Jan 2016 08:01:02 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id AF69C2041F
	for ; Fri, 15 Jan 2016 08:00:59 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by mail.kernel.org (Postfix) with ESMTP id BCA2F2041C
	for ; Fri, 15 Jan 2016 08:00:57 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 3124E6EB00;
	Fri, 15 Jan 2016 00:00:56 -0800 (PST)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120])
	by gabe.freedesktop.org (Postfix) with ESMTP id 0B05B6EB00
	for ; Fri, 15 Jan 2016 00:00:54 -0800 (PST)
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
	by fmsmga104.fm.intel.com with ESMTP; 15 Jan 2016 00:00:55 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.22,298,1449561600"; d="scan'208";a="29786577"
Received: from gklv-ahlin.igk.intel.com ([10.88.112.11])
	by fmsmga004.fm.intel.com with ESMTP; 15 Jan 2016 00:00:54 -0800
From: Artur Harasimiuk
To: intel-gfx@lists.freedesktop.org
Date: Fri, 15 Jan 2016 09:00:41 +0100
Message-Id: <1452844841-20982-1-git-send-email-artur.harasimiuk@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1452681879-1645-1-git-send-email-artur.harasimiuk@intel.com>
References: <1452681879-1645-1-git-send-email-artur.harasimiuk@intel.com>
Subject: [Intel-gfx] [PATCH] [v2] drm/i915: Exec flag to force non
 IA-Coherent cache for Gen9+
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Starting from Gen9 we can use IA-Coherent caches. Generally, coherency
can be programmed using RENDER_SURFACE_STATE or BTI 255, depending on
whether the surface state model or the stateless model is used. It is
important to control whether IA or GPU cache coherency should be used,
especially for non-LLC devices. However, this control is complicated
when the stateless memory access model is in use: it would require
dedicated ISA code depending on the coherency requirement.

By setting HDC_FORCE_NON_COHERENT we *Force* the data port to ignore
these attributes, so all caches are GPU-Coherent. This register is part
of the HW context; however, it is private and cannot be programmed from
a non-privileged batch buffer.

The default operation mode is as programmed by the workaround. When
WaForceEnableNonCoherent is in place, caches are GPU-Coherent and we
should not change them back to IA-Coherent because this can lead to GPU
hangs (as the workaround description says).

A new device parameter informs user space about the kernel capability.
It tells whether user space can request that IA-Coherency be disabled.

The exec flag allows the UMD to decide whether IA-Coherency is not
needed for the submitted batch buffer.

Exec flag behavior:
1. flag is not set - use system default
2. flag is set but WaForceEnableNonCoherent is
   a) not programmed - *Force* GPU-Coherent cache by setting
      HDC_FORCE_NON_COHERENT prior to bb_start and clearing after
   b) programmed - do nothing, GPU-Coherent is already in place

v2:
    Ringbuffer handling fixes (Chris)
    Moved workarounds to common place (Chris)
    Removed flag cleanup (Dave)
    Updated commit message to reflect comments (Chris,Dave)

Signed-off-by: Artur Harasimiuk
---
 drivers/gpu/drm/i915/i915_dma.c            |  4 ++++
 drivers/gpu/drm/i915/i915_drv.h            |  4 ++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++++
 drivers/gpu/drm/i915/intel_lrc.c           | 38 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  2 ++
 include/uapi/drm/i915_drm.h                |  8 ++++++-
 6 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 44a896c..f735e56 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -172,6 +172,10 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_SOFTPIN:
 		value = 1;
 		break;
+	case I915_PARAM_HAS_EXEC_FORCE_NON_COHERENT:
+		value = !dev_priv->workarounds.WaForceEnableNonCoherent &&
+			INTEL_INFO(dev)->gen >= 9;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 104bd18..793be854 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1658,6 +1658,10 @@ struct i915_wa_reg {
 struct i915_workarounds {
 	struct i915_wa_reg reg[I915_MAX_WA_REGS];
 	u32 count;
+
+	struct {
+		unsigned int WaForceEnableNonCoherent:1;
+	};
 };
 
 struct i915_virtual_gpu {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d469c47..5db3806 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1400,6 +1400,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (!i915_gem_check_execbuffer(args))
 		return -EINVAL;
 
+	if ((args->flags & I915_EXEC_FORCE_NON_COHERENT) &&
+			INTEL_INFO(dev)->gen < 9)
+		return -EINVAL;
+
 	ret = validate_exec_list(dev, exec, args->buffer_count);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ab344e0..f37d12f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -879,6 +879,36 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
 	return intel_logical_ring_begin(request, 0);
 }
 
+static inline int
+intel_lr_emit_force_non_coherent(struct i915_execbuffer_params *params,
+		struct drm_i915_gem_execbuffer2 *args, bool force)
+{
+	struct drm_device *dev = params->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret;
+
+	if (dev_priv->workarounds.WaForceEnableNonCoherent)
+		return 0;
+
+	if (args->flags & I915_EXEC_FORCE_NON_COHERENT) {
+		struct intel_ringbuffer *ringbuf = params->request->ringbuf;
+
+		ret = intel_logical_ring_begin(params->request, 4);
+		if (ret)
+			return ret;
+
+		intel_logical_ring_emit(ringbuf, MI_NOOP);
+		intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+		intel_logical_ring_emit(ringbuf, HDC_CHICKEN0.reg);
+		intel_logical_ring_emit(ringbuf, force ?
+			_MASKED_BIT_ENABLE(HDC_FORCE_NON_COHERENT) :
+			_MASKED_BIT_DISABLE(HDC_FORCE_NON_COHERENT));
+		intel_logical_ring_advance(ringbuf);
+	}
+
+	return 0;
+}
+
 /**
  * execlists_submission() - submit a batchbuffer for execution, Execlists style
  * @dev: DRM device.
@@ -959,6 +989,10 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 		dev_priv->relative_constants_mode = instp_mode;
 	}
 
+	ret = intel_lr_emit_force_non_coherent(params, args, true);
+	if (ret)
+		return ret;
+
 	exec_start = params->batch_obj_vm_offset +
 		     args->batch_start_offset;
 
@@ -966,6 +1000,10 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	if (ret)
 		return ret;
 
+	ret = intel_lr_emit_force_non_coherent(params, args, false);
+	if (ret)
+		return ret;
+
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4060acf..456421e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -806,6 +806,7 @@ static int gen8_init_workarounds(struct intel_engine_cs *ring)
 	 * invalidation occurs during a PSD flush.
 	 */
 	/* WaForceEnableNonCoherent:bdw,chv */
+	dev_priv->workarounds.WaForceEnableNonCoherent = 1;
 	/* WaHdcDisableFetchWhenMasked:bdw,chv */
 	WA_SET_BIT_MASKED(HDC_CHICKEN0,
 			  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
@@ -1049,6 +1050,7 @@ static int skl_init_workarounds(struct intel_engine_cs *ring)
 	 * a TLB invalidation occurs during a PSD flush.
 	 */
 	/* WaForceEnableNonCoherent:skl */
+	dev_priv->workarounds.WaForceEnableNonCoherent = 1;
 	WA_SET_BIT_MASKED(HDC_CHICKEN0,
 			  HDC_FORCE_NON_COHERENT);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index acf2102..c425e80 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_GPU_RESET	 35
 #define I915_PARAM_HAS_RESOURCE_STREAMER 36
 #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
+#define I915_PARAM_HAS_EXEC_FORCE_NON_COHERENT 38
 
 typedef struct drm_i915_getparam {
 	__s32 param;
@@ -782,7 +783,12 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_RESOURCE_STREAMER     (1<<15)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_RESOURCE_STREAMER<<1)
+/**
+ * Tell the kernel that the batch buffer requires to disable IA-Coherency
+ */
+#define I915_EXEC_FORCE_NON_COHERENT     (1<<16)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_FORCE_NON_COHERENT<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
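
Reviewer note (not part of the patch): the exec flag behavior listed in the commit message, together with the getparam and execbuffer checks in the diff, can be summarized as a small standalone C model. This is only a sketch for reasoning about the cases. The enum and both function names are hypothetical and do not exist in the driver; only I915_EXEC_FORCE_NON_COHERENT and its value (1<<16) are taken from the uapi hunk. The model assumes the execlists submission path, which is the only path the patch touches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the uapi hunk of the patch. */
#define I915_EXEC_FORCE_NON_COHERENT (1u << 16)

/* Hypothetical names, used only to label the outcomes in this sketch. */
enum coherency_action {
	COHERENCY_USE_DEFAULT,          /* flag not set: system default */
	COHERENCY_REJECT_EINVAL,        /* flag set on gen < 9 */
	COHERENCY_ALREADY_GPU_COHERENT, /* workaround programmed: do nothing */
	COHERENCY_FORCE_AROUND_BATCH    /* set bit before bb_start, clear after */
};

/* Mirrors the I915_PARAM_HAS_EXEC_FORCE_NON_COHERENT expression. */
static int has_exec_force_non_coherent(int gen, bool wa_force_non_coherent)
{
	return !wa_force_non_coherent && gen >= 9;
}

/* Mirrors the order of checks: execbuffer validation first, then emit. */
static enum coherency_action
classify_exec(uint64_t flags, int gen, bool wa_force_non_coherent)
{
	if (!(flags & I915_EXEC_FORCE_NON_COHERENT))
		return COHERENCY_USE_DEFAULT;
	if (gen < 9)
		return COHERENCY_REJECT_EINVAL;
	if (wa_force_non_coherent)
		return COHERENCY_ALREADY_GPU_COHERENT;
	return COHERENCY_FORCE_AROUND_BATCH;
}
```

Under this model, the parameter reads 1 only in the case where setting the flag actually changes register state, that is, when the classification would be COHERENCY_FORCE_AROUND_BATCH.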