From patchwork Tue Mar 27 10:26:15 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: kevin.rogovin@intel.com
X-Patchwork-Id: 10309699
Received: from
 sinekoff-mobl1.ger.corp.intel.com (HELO LittleBigTrouble.ger.corp.intel.com)
 ([10.252.34.146]) by orsmga003.jf.intel.com with ESMTP; 27 Mar 2018 03:26:24 -0700
From: kevin.rogovin@intel.com
To: intel-gfx@lists.freedesktop.org, abdiel.janulgue@linux.intel.com,
 joonas.lahtinen@linux.intel.com
Date: Tue, 27 Mar 2018 13:26:15 +0300
Message-Id: <1522146379-9358-2-git-send-email-kevin.rogovin@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522146379-9358-1-git-send-email-kevin.rogovin@intel.com>
References: <1522146379-9358-1-git-send-email-kevin.rogovin@intel.com>
Subject: [Intel-gfx] [PATCH v3 1/5] i915.rst: Narration overview on GEM + minor reorder to improve narration
List-Id: Intel graphics driver community testing & development
Cc: Kevin Rogovin

From: Kevin Rogovin

Signed-off-by: Kevin Rogovin
---
 Documentation/gpu/i915.rst      | 129 +++++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_vma.h |  10 +++-
 2 files changed, 113 insertions(+), 26 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 41dc881..ed8e08d 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -249,6 +249,112 @@ Memory Management and Command Submission
 This section covers all things related to the GEM implementation in the
 i915 driver.

+Intel GPU Basics
+----------------
+
+An Intel GPU has multiple engines, of several engine types. The
+user-space value `I915_EXEC_DEFAULT` is an alias for the user-space
+value `I915_EXEC_RENDER`.
+
+- The RCS engine renders 3D and performs compute; it is named `I915_EXEC_RENDER` in user space.
+- BCS is a blitting (copy) engine; it is named `I915_EXEC_BLT` in user space.
+- VCS is a video encode and decode engine; it is named `I915_EXEC_BSD` in user space.
+- VECS is a video enhancement engine; it is named `I915_EXEC_VEBOX` in user space.
+
+The Intel GPU family is a family of integrated GPUs using Unified
+Memory Architecture. To have the GPU "do work", user space feeds the
+GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
+or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR` (the ioctl
+`DRM_IOCTL_I915_GEM_EXECBUFFER` is deprecated). Most such batchbuffers
+will instruct the GPU to perform work (for example rendering) and that
+work needs memory from which to read and memory to which to write. All
+memory is encapsulated within GEM buffer objects (usually created with
+the ioctl `DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a
+batchbuffer for the GPU to execute will also list all the GEM buffer
+objects that the batchbuffer reads and/or writes. For implementation
+details of memory management see
+`GEM BO Management Implementation Details`_.
+
+A GPU pipeline (most strongly so for the RCS engine) has a great deal
+of state which is to be programmed by user space via the contents of a
+batchbuffer. Starting in Gen6 (SandyBridge), hardware contexts are
+supported. A hardware context encapsulates GPU pipeline state and other
+portions of GPU state, and it is much more efficient for the GPU to
+load a hardware context than to re-submit commands in a batchbuffer to
+the GPU to restore state. In addition, using hardware contexts provides
+much better isolation between user space clients. The ioctl
+`DRM_IOCTL_I915_GEM_CONTEXT_CREATE` is used by user space to create a
+hardware context, which is identified by a 32-bit integer. The
+non-deprecated ioctls to submit batchbuffer work can pass that ID (in
+the lower bits of drm_i915_gem_execbuffer2::rsvd1) to identify which HW
+context to use with the command.
+When the kernel submits the batchbuffer to be executed by the GPU, it
+will also instruct the GPU to load the HW context prior to executing
+the contents of the batchbuffer.
+
+The GPU has its own memory management and address space; the kernel
+driver maintains the memory translation table for the GPU. For older
+GPUs (i.e. those before Gen8), there is a single such global
+translation table, the global Graphics Translation Table (GTT). For
+newer generation GPUs, each hardware context has its own translation
+table, called a Per-Process Graphics Translation Table (PPGTT). Of
+important note is that, although the PPGTT is named per-process, it is
+actually per hardware context. When user space submits a batchbuffer,
+the kernel walks the list of GEM buffer objects used by the batchbuffer
+and guarantees that not only is the memory of each such GEM buffer
+object resident, but that it is also present in the (PP)GTT. If a GEM
+buffer object is not yet placed in the (PP)GTT, it is given an address.
+Two consequences of this are: the kernel needs to edit the submitted
+batchbuffer to write the correct value of the GPU address once a GEM BO
+is assigned one, and the kernel might evict a different GEM BO from the
+(PP)GTT to make room.
+
+Consequently, the ioctls submitting a batchbuffer for execution also
+include a list of all locations within buffers that refer to GPU
+addresses, so that the kernel can edit the buffer correctly. This
+process is dubbed relocation. The ioctls allow user space to provide
+the kernel a presumed offset for each GEM buffer object used in a
+batchbuffer. If the kernel sees that the address provided by user
+space is correct, then it skips performing relocation for that GEM
+buffer object. In addition, the kernel reports back to user space the
+address to which it relocated each GEM buffer object.
+
+There is also an interface for user space to directly specify the
+address location of GEM BOs; this feature is called soft-pinning and is
+made active within an execbuffer2 ioctl by setting the
+`EXEC_OBJECT_PINNED` bit for an object. If user space also specifies
+`I915_EXEC_NO_RELOC`, then the kernel will not perform any relocation
+and user space manages the address space of its PPGTT itself. The
+advantage of user space handling the address space is that the kernel
+then does far less work and user space can safely assume that GEM
+buffer objects' locations in the GPU address space do not change.
+
+GEM BO Management Implementation Details
+----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_vma.h
+   :doc: Virtual Memory Address
+
+Buffer Object Eviction
+----------------------
+
+This section documents the interface functions for evicting buffer
+objects to make space available in the virtual GPU address spaces.
+Note that this is mostly orthogonal to shrinking buffer object caches,
+which has the goal of making main memory (shared with the GPU through
+the unified memory architecture) available.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
+   :internal:
+
+Buffer Object Memory Shrinking
+------------------------------
+
+This section documents the interface function for shrinking memory
+usage of buffer object caches. Shrinking is used to make main memory
+available. Note that this is mostly orthogonal to evicting buffer
+objects, which has the goal of making space in GPU virtual address
+spaces.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
+   :internal:
+
 Batchbuffer Parsing
 -------------------
@@ -312,29 +418,6 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/i915_gem_tiling.c
    :doc: buffer object tiling

-Buffer Object Eviction
-----------------------
-
-This section documents the interface functions for evicting buffer
-objects to make space available in the virtual gpu address spaces.
Note
-that this is mostly orthogonal to shrinking buffer objects caches, which
-has the goal to make main memory (shared with the gpu through the
-unified memory architecture) available.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
-   :internal:
-
-Buffer Object Memory Shrinking
-------------------------------
-
-This section documents the interface function for shrinking memory usage
-of buffer object caches. Shrinking is used to make main memory
-available. Note that this is mostly orthogonal to evicting buffer
-objects, which has the goal to make space in gpu virtual address spaces.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
-   :internal:
-
 GuC
 ===

diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 8c50220..0000f23 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -38,9 +38,13 @@ enum i915_cache_level;

 /**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
+ * DOC: Virtual Memory Address
+ *
+ * An `i915_vma` struct represents a GEM BO that is bound into an address
+ * space. Therefore, a VMA's presence cannot be guaranteed before binding, or
+ * after unbinding the object into/from the address space. The struct includes
+ * the bookkeeping details needed for tracking it in all the lists with which
+ * it interacts.
  *
  * To make things as simple as possible (i.e. no refcounting), a VMA's lifetime
  * will always be <= an object's lifetime. So object refcounting should cover us.