From patchwork Tue Oct 21 10:49:50 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dheeraj Jamwal X-Patchwork-Id: 5116941 Return-Path: X-Original-To: patchwork-ltsi-dev@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id DEA82C11AC for ; Tue, 21 Oct 2014 11:30:11 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id BA47020107 for ; Tue, 21 Oct 2014 11:30:10 +0000 (UTC) Received: from mail.linuxfoundation.org (mail.linuxfoundation.org [140.211.169.12]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B5F64200D9 for ; Tue, 21 Oct 2014 11:30:09 +0000 (UTC) Received: from mail.linux-foundation.org (localhost [127.0.0.1]) by mail.linuxfoundation.org (Postfix) with ESMTP id 120DBC19; Tue, 21 Oct 2014 11:04:53 +0000 (UTC) X-Original-To: ltsi-dev@lists.linuxfoundation.org Delivered-To: ltsi-dev@mail.linuxfoundation.org Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 21623EAA for ; Tue, 21 Oct 2014 11:04:52 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by smtp1.linuxfoundation.org (Postfix) with ESMTP id 892151FAB2 for ; Tue, 21 Oct 2014 11:04:51 +0000 (UTC) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP; 21 Oct 2014 04:04:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,761,1406617200"; d="scan'208";a="617795671" Received: from ubuntu-desktop.png.intel.com ([10.221.122.25]) by fmsmga002.fm.intel.com with ESMTP; 21 Oct 2014 04:04:50 -0700 From: Dheeraj Jamwal To: ltsi-dev@lists.linuxfoundation.org Date: Tue, 21 Oct 2014 18:49:50 +0800 Message-Id: <1413889294-31328-391-git-send-email-dheerajx.s.jamwal@intel.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com> References: <1413889294-31328-1-git-send-email-dheerajx.s.jamwal@intel.com> X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org Subject: [LTSI-dev] [PATCH 0390/1094] drm/doc: Overview documentation for drm_mm.c X-BeenThere: ltsi-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: "A list to discuss patches, development, and other things related to the LTSI project" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ltsi-dev-bounces@lists.linuxfoundation.org Errors-To: ltsi-dev-bounces@lists.linuxfoundation.org X-Virus-Scanned: ClamAV using ClamSMTP From: Daniel Vetter kerneldoc polish will follow in the next patch. Hopefully documenting the lru scan support a bit better spurs someone to give this a shot in the ttm eviction code. At least in i915 it helped quite a lot with memory thrashing on platforms where eviction was (we've fixed that too meanwhile) fairly expensive. Reviewed-by: Alex Deucher Signed-off-by: Daniel Vetter (cherry picked from commit 93110be69616df7dcd9cc3611e94400287fc26fb) Signed-off-by: Dheeraj Jamwal --- Documentation/DocBook/drm.tmpl | 11 +++++++ drivers/gpu/drm/drm_mm.c | 67 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+) diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index dd2a955..2ac018b 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -920,6 +920,17 @@ struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev, PRIME Function References !Edrivers/gpu/drm/drm_prime.c + + DRM MM Range Allocator + + Overview +!Pdrivers/gpu/drm/drm_mm.c Overview + + + LRU Scan/Eviction Support +!Pdrivers/gpu/drm/drm_mm.c lru scan roaster + + diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c index d0a8e84..276a7a2 100644 --- a/drivers/gpu/drm/drm_mm.c +++ b/drivers/gpu/drm/drm_mm.c @@ -47,6 +47,45 @@ #include #include +/** + * DOC: Overview + * + * drm_mm provides a simple range allocator. The drivers are free to use the + * resource allocator from the linux core if it suits them, the upside of drm_mm + * is that it's in the DRM core. Which means that it's easier to extend for + * some of the crazier special purpose needs of gpus. + * + * The main data struct is &drm_mm, allocations are tracked in &drm_mm_node. + * Drivers are free to embed either of them into their own suitable + * datastructures. drm_mm itself will not do any allocations of its own, so if + * drivers choose not to embed nodes they need to still allocate them + * themselves. + * + * The range allocator also supports reservation of preallocated blocks. This is + * useful for taking over initial mode setting configurations from the firmware, + * where an object needs to be created which exactly matches the firmware's + * scanout target. As long as the range is still free it can be inserted anytime + * after the allocator is initialized, which helps with avoiding looped + * depencies in the driver load sequence. + * + * drm_mm maintains a stack of most recently freed holes, which of all + * simplistic datastructures seems to be a fairly decent approach to clustering + * allocations and avoiding too much fragmentation. This means free space + * searches are O(num_holes). Given that all the fancy features drm_mm supports + * something better would be fairly complex and since gfx thrashing is a fairly + * steep cliff not a real concern. Removing a node again is O(1). + * + * drm_mm supports a few features: Alignment and range restrictions can be + * supplied. Further more every &drm_mm_node has a color value (which is just an + * opaqua unsigned long) which in conjunction with a driver callback can be used + * to implement sophisticated placement restrictions. The i915 DRM driver uses + * this to implement guard pages between incompatible caching domains in the + * graphics TT. + * + * Finally iteration helpers to walk all nodes and all holes are provided as are + * some basic allocator dumpers for debugging. + */ + static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm, unsigned long size, unsigned alignment, @@ -400,6 +439,34 @@ void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new) EXPORT_SYMBOL(drm_mm_replace_node); /** + * DOC: lru scan roaster + * + * Very often GPUs need to have continuous allocations for a given object. When + * evicting objects to make space for a new one it is therefore not most + * efficient when we simply start to select all objects from the tail of an LRU + * until there's a suitable hole: Especially for big objects or nodes that + * otherwise have special allocation constraints there's a good chance we evict + * lots of (smaller) objects unecessarily. + * + * The DRM range allocator supports this use-case through the scanning + * interfaces. First a scan operation needs to be initialized with + * drm_mm_init_scan() or drm_mm_init_scan_with_range(). The the driver adds + * objects to the roaster (probably by walking an LRU list, but this can be + * freely implemented) until a suitable hole is found or there's no further + * evitable object. + * + * The the driver must walk through all objects again in exactly the reverse + * order to restore the allocator state. Note that while the allocator is used + * in the scan mode no other operation is allowed. + * + * Finally the driver evicts all objects selected in the scan. Adding and + * removing an object is O(1), and since freeing a node is also O(1) the overall + * complexity is O(scanned_objects). So like the free stack which needs to be + * walked before a scan operation even begins this is linear in the number of + * objects. It doesn't seem to hurt badly. + */ + +/** * Initializa lru scanning. * * This simply sets up the scanning routines with the parameters for the desired