From patchwork Tue Mar 27 10:26:15 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: kevin.rogovin@intel.com
X-Patchwork-Id: 10309699
Received: from
 sinekoff-mobl1.ger.corp.intel.com (HELO LittleBigTrouble.ger.corp.intel.com)
 ([10.252.34.146]) by orsmga003.jf.intel.com with ESMTP; 27 Mar 2018 03:26:24 -0700
From: kevin.rogovin@intel.com
To: intel-gfx@lists.freedesktop.org, abdiel.janulgue@linux.intel.com,
 joonas.lahtinen@linux.intel.com
Date: Tue, 27 Mar 2018 13:26:15 +0300
Message-Id: <1522146379-9358-2-git-send-email-kevin.rogovin@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522146379-9358-1-git-send-email-kevin.rogovin@intel.com>
References: <1522146379-9358-1-git-send-email-kevin.rogovin@intel.com>
Subject: [Intel-gfx] [PATCH v3 1/5] i915.rst: Narration overview on GEM + minor reorder to improve narration
List-Id: Intel graphics driver community testing & development
Cc: Kevin Rogovin

From: Kevin Rogovin

Signed-off-by: Kevin Rogovin
---
 Documentation/gpu/i915.rst      | 129 +++++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_vma.h |  10 +++-
 2 files changed, 113 insertions(+), 26 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 41dc881..ed8e08d 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -249,6 +249,112 @@ Memory Management and Command Submission
 This section covers all things related to the GEM implementation in the
 i915 driver.

+Intel GPU Basics
+----------------
+
+An Intel GPU has multiple engines, of several engine types. The
+user-space value `I915_EXEC_DEFAULT` is an alias for the user-space
+value `I915_EXEC_RENDER`.
+
+- The RCS engine renders 3D and performs compute; it is named `I915_EXEC_RENDER` in user space.
+- BCS is a blitting (copy) engine; it is named `I915_EXEC_BLT` in user space.
+- VCS is a video encode and decode engine; it is named `I915_EXEC_BSD` in user space.
+- VECS is a video enhancement engine; it is named `I915_EXEC_VEBOX` in user space.
+
+The Intel GPU family is a family of integrated GPUs using Unified
+Memory Architecture. To have the GPU "do work", user space feeds the
+GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
+or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR` (the ioctl
+`DRM_IOCTL_I915_GEM_EXECBUFFER` is deprecated). Most such batchbuffers
+will instruct the GPU to perform work (for example rendering) and that
+work needs memory from which to read and memory to which to write. All
+memory is encapsulated within GEM buffer objects (usually created with
+the ioctl `DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a
+batchbuffer for the GPU to execute will also list all the GEM buffer
+objects that the batchbuffer reads and/or writes. For implementation
+details of memory management see
+`GEM BO Management Implementation Details`_.
+
+A GPU pipeline (most strongly so for the RCS engine) has a great deal
+of state which is to be programmed by user space via the contents of a
+batchbuffer. Starting in Gen6 (SandyBridge), hardware contexts are
+supported. A hardware context encapsulates GPU pipeline state and other
+portions of GPU state, and it is much more efficient for the GPU to
+load a hardware context than to re-submit commands in a batchbuffer to
+the GPU to restore state. In addition, using hardware contexts provides
+much better isolation between user space clients. The ioctl
+`DRM_IOCTL_I915_GEM_CONTEXT_CREATE` is used by user space to create a
+hardware context, which is identified by a 32-bit integer. The
+non-deprecated ioctls to submit batchbuffer work can pass that ID (in
+the lower bits of drm_i915_gem_execbuffer2::rsvd1) to identify which HW
+context to use with the command.
+When the kernel submits the batchbuffer to be executed by the GPU, it
+will also instruct the GPU to load the HW context prior to executing
+the contents of the batchbuffer.
+
+The GPU has its own memory management and address space; the kernel
+driver maintains the memory translation table for the GPU. For older
+GPUs (i.e. those before Gen8), there is a single such global
+translation table, the global Graphics Translation Table (GTT). For
+newer generation GPUs, each hardware context has its own translation
+table, called a Per-Process Graphics Translation Table (PPGTT). Of
+important note is that, although the PPGTT is named per-process, it is
+actually per hardware context. When user space submits a batchbuffer,
+the kernel walks the list of GEM buffer objects used by the batchbuffer
+and guarantees that not only is the memory of each such GEM buffer
+object resident, but that it is also present in the (PP)GTT. If a GEM
+buffer object is not yet placed in the (PP)GTT, it is given an address.
+Two consequences of this are: the kernel needs to edit the submitted
+batchbuffer to write the correct value of the GPU address once a GEM BO
+is assigned one, and the kernel might evict a different GEM BO from the
+(PP)GTT to make room.
+
+Consequently, the ioctls submitting a batchbuffer for execution also
+include a list of all locations within buffers that refer to GPU
+addresses, so that the kernel can edit the buffer correctly. This
+process is dubbed relocation. The ioctls allow user space to provide
+the kernel a presumed offset for each GEM buffer object used in a
+batchbuffer. If the kernel sees that the address provided by user
+space is correct, then it skips performing relocation for that GEM
+buffer object. In addition, the kernel reports back to user space the
+address to which it relocated each GEM buffer object.
+
+There is also an interface for user space to directly specify the
+address location of GEM BOs; this feature is called soft-pinning and is
+made active within an execbuffer2 ioctl by setting the
+`EXEC_OBJECT_PINNED` bit for an object. If user space also specifies
+`I915_EXEC_NO_RELOC`, then the kernel will not perform any relocation
+and user space manages the address space of its PPGTT itself. The
+advantage of user space handling the address space is that the kernel
+then does far less work and user space can safely assume that GEM
+buffer objects' locations in the GPU address space do not change.
+
+GEM BO Management Implementation Details
+----------------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_vma.h
+   :doc: Virtual Memory Address
+
+Buffer Object Eviction
+----------------------
+
+This section documents the interface functions for evicting buffer
+objects to make space available in the virtual GPU address spaces.
+Note that this is mostly orthogonal to shrinking buffer object caches,
+which has the goal of making main memory (shared with the GPU through
+the unified memory architecture) available.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
+   :internal:
+
+Buffer Object Memory Shrinking
+------------------------------
+
+This section documents the interface function for shrinking memory
+usage of buffer object caches. Shrinking is used to make main memory
+available. Note that this is mostly orthogonal to evicting buffer
+objects, which has the goal of making space in GPU virtual address
+spaces.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
+   :internal:
+
 Batchbuffer Parsing
 -------------------
@@ -312,29 +418,6 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/i915_gem_tiling.c
    :doc: buffer object tiling

-Buffer Object Eviction
-----------------------
-
-This section documents the interface functions for evicting buffer
-objects to make space available in the virtual gpu address spaces.
Note
-that this is mostly orthogonal to shrinking buffer objects caches, which
-has the goal to make main memory (shared with the gpu through the
-unified memory architecture) available.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_evict.c
-   :internal:
-
-Buffer Object Memory Shrinking
-------------------------------
-
-This section documents the interface function for shrinking memory usage
-of buffer object caches. Shrinking is used to make main memory
-available. Note that this is mostly orthogonal to evicting buffer
-objects, which has the goal to make space in gpu virtual address spaces.
-
-.. kernel-doc:: drivers/gpu/drm/i915/i915_gem_shrinker.c
-   :internal:
-
 GuC
 ===

diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 8c50220..0000f23 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -38,9 +38,13 @@ enum i915_cache_level;

 /**
- * A VMA represents a GEM BO that is bound into an address space. Therefore, a
- * VMA's presence cannot be guaranteed before binding, or after unbinding the
- * object into/from the address space.
+ * DOC: Virtual Memory Address
+ *
+ * An `i915_vma` struct represents a GEM BO that is bound into an address
+ * space. Therefore, a VMA's presence cannot be guaranteed before binding, or
+ * after unbinding the object into/from the address space. The struct includes
+ * the bookkeeping details needed for tracking it in all the lists with which
+ * it interacts.
  *
  * To make things as simple as possible (i.e. no refcounting), a VMA's lifetime
  * will always be <= an object's lifetime. So object refcounting should cover us.