From patchwork Fri Feb 7 16:13:25 2020
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 11370731
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Date: Fri, 7 Feb 2020 16:13:25 +0000
Message-Id: <20200207161331.23447-1-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.20.1
Subject: [Intel-gfx] [RFC 0/8] Per client engine busyness

From: Tvrtko Ursulin

Another re-spin of the per-client engine busyness series. Highlights from
this version:

 * Refactor of the i915_drm_client concept based on feedback from Chris.

 * Tracking contexts belonging to a client had a flaw where reaped
   contexts would stop contributing to the busyness, making the counters
   go backwards. It was simply broken. In this version I had to track
   per-client stats at the same call-sites where per-context stats are
   tracked. It is a little bit more runtime cost, but not too bad.

 * GuC support is out - it needs more time than I have at the moment to
   properly wire it up with the latest changes. Plus there is a
   monotonicity issue, with either the value stored in the pphwsp by the
   GPU or a bug in my patch, which also needs to be debugged.

Internally we track time spent on engines for each struct intel_context.
This can serve as a building block for several features from the want
list: smarter scheduler decisions, getrusage(2)-like per-GEM-context
functionality wanted by some customers, cgroups controller, dynamic SSEU
tuning, ...

Externally, in sysfs, we expose time spent on the GPU per client and per
engine class.
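The per-context tracking essentially boils down to timestamping a context
as it is scheduled in on an engine and accumulating the delta when it is
scheduled out. A minimal sketch of that idea (the struct and helper names
below are illustrative assumptions, not the actual driver code):

  #include <linux/ktime.h>

  /* Illustrative only - not the real i915 data structures. */
  struct ctx_busy_stats {
          ktime_t start;  /* timestamp of last schedule-in */
          ktime_t total;  /* accumulated engine time */
  };

  static void ctx_sched_in(struct ctx_busy_stats *stats)
  {
          stats->start = ktime_get();
  }

  static void ctx_sched_out(struct ctx_busy_stats *stats)
  {
          stats->total = ktime_add(stats->total,
                                   ktime_sub(ktime_get(), stats->start));
  }

Accumulating the per-client totals at the same schedule-out points is
what keeps the client counters monotonic even after individual contexts
are reaped.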
The sysfs interface enables us to implement a "top-like" tool for GPU
tasks. Or, with a "screenshot":

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME   Render/3D      Blitter        Video    VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation-wise we add a bunch of files in sysfs like:

  # cd /sys/class/drm/card0/clients/
  # tree
  .
  ├── 7
  │   ├── busy
  │   │   ├── 0
  │   │   ├── 1
  │   │   ├── 2
  │   │   └── 3
  │   ├── name
  │   └── pid
  ├── 8
  │   ├── busy
  │   │   ├── 0
  │   │   ├── 1
  │   │   ├── 2
  │   │   └── 3
  │   ├── name
  │   └── pid
  └── 9
      ├── busy
      │   ├── 0
      │   ├── 1
      │   ├── 2
      │   └── 3
      ├── name
      └── pid

Files in the 'busy' directories are numbered using the engine class ABI
values and contain the accumulated nanoseconds each client spent on
engines of the respective class.

It is still an RFC since it is missing dedicated test cases to ensure
things really work as advertised.

Tvrtko Ursulin (6):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track per-context engine busyness
  drm/i915: Track per drm client engine class busyness
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/Makefile                 |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  31 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  13 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 +
 drivers/gpu/drm/i915/gt/intel_context.h       |  35 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  65 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |  31 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 350 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  88 +++++
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  35 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  21 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 15 files changed, 673 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h
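For reference, a minimal userspace sketch of how a top-like tool could
consume the proposed interface - sample a busy counter twice and divide
the delta by the elapsed wall time. The client id (7) and engine class
(0, Render/3D) are arbitrary examples, not fixed by the interface:

  #include <stdio.h>
  #include <unistd.h>

  /* Read an accumulated-nanoseconds counter from a sysfs busy file. */
  static unsigned long long read_busy(const char *path)
  {
          unsigned long long ns = 0;
          FILE *f = fopen(path, "r");

          if (f) {
                  if (fscanf(f, "%llu", &ns) != 1)
                          ns = 0;
                  fclose(f);
          }

          return ns;
  }

  int main(void)
  {
          /* Example path - client 7, engine class 0 (Render/3D). */
          const char *path = "/sys/class/drm/card0/clients/7/busy/0";
          unsigned long long t0, t1;

          t0 = read_busy(path);
          sleep(1); /* 1s sample period == 1e9 ns */
          t1 = read_busy(path);

          /* Counter is in nanoseconds: delta / 1e9 * 100 == percent. */
          printf("Render/3D: %.2f%% busy\n", (t1 - t0) / 1e7);

          return 0;
  }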