From patchwork Thu Jun 27 15:47:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Maarten Lankhorst
X-Patchwork-Id: 13714713
From: Maarten Lankhorst
To: intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 dri-devel@lists.freedesktop.org, Tejun Heo, Zefan Li, Johannes Weiner,
 Andrew Morton
Cc: Friedrich Vock, cgroups@vger.kernel.org, linux-mm@kvack.org,
 Maarten Lankhorst
Subject: [RFC PATCH 0/6] DRM resource management cgroup, try 2.
Date: Thu, 27 Jun 2024 17:47:19 +0200
Message-ID: <20240627154754.74828-1-maarten.lankhorst@linux.intel.com>
X-Mailer: git-send-email 2.45.2

Hey,

A new version of my attempt at managing VRAM through cgroups. Even though
it's called the DRM resource management cgroup, it would be trivial to
rename it to devmem or whatever, since there is nothing DRM specific
about it.

This series allows setting limits on VRAM similar to system memory, with
min/low/max limits. This allows various cgroups to have their own limits
for usage.

It sounds very abstract, but it can be used to prioritise the foreground
application (by setting low), to hard partition memory so multiple
processes sharing a single GPU each use a proportional amount of memory
in a fair way, or to prevent long running compute jobs from having their
memory evicted.

This is a minimal proof of concept to get discussion going again. It
works, but it only tracks active use of VRAM.

In an ideal world, we would track it better, in a way that also
integrates better with the memory cgroup controller. Ideally, for every
VRAM allocation, we would know we could push it out to swap if needed,
charging the original process, not the process evicting.

I'm hoping to restart the discussion, so that we can plug the holes and
finally move forward.

New in this version:
- Complete rewrite using page_counter.
- Support setting min/low/max, respected in the same way as memory
  cgroup. (Could be useful to add/allow high? To go over limit for
  temporary bindings during eviction on GART.)
- Locking reworked. Fastpath should now be lockless with RCU.
- Add a second implementation for AMD, to show how easy it is to make it
  work. (Should we completely move this to TTM instead?)
- TTM now always respects min/low when evicting, bailing out with
  -ENOSPC instead where required.

I'm hoping for some good feedback on the path forward for upstreaming.
I feel this version has a much better chance of being upstreamed than
the previous one. It should be a lot more scalable thanks to the usage
of RCU and page_counter.
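
To make the min/low/max semantics above a bit more concrete, here is a
rough userspace sketch. It is not code from this series and it does not
use the kernel's struct page_counter: the struct and the names
vram_counter, vram_try_charge, vram_uncharge and vram_may_evict are all
made up for illustration. It also compares usage directly against
min/low, whereas a real implementation would presumably work with
hierarchically computed effective protection values. There is no locking
or RCU here either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cgroup VRAM counter (illustration only). */
struct vram_counter {
	unsigned long usage;          /* pages currently charged */
	unsigned long min;            /* hard protection: never evict below this */
	unsigned long low;            /* soft protection: evict only as a last resort */
	unsigned long max;            /* hard limit on usage */
	struct vram_counter *parent;  /* NULL for the root of the hierarchy */
};

/*
 * Charge 'pages' to 'c' and every ancestor. If any level would exceed its
 * max, undo the levels already charged and report failure.
 */
static bool vram_try_charge(struct vram_counter *c, unsigned long pages)
{
	struct vram_counter *it, *undo;

	for (it = c; it; it = it->parent) {
		/* Written this way to avoid overflow in usage + pages. */
		if (it->usage > it->max || pages > it->max - it->usage) {
			for (undo = c; undo != it; undo = undo->parent)
				undo->usage -= pages;
			return false;
		}
		it->usage += pages;
	}
	return true;
}

/* Drop 'pages' from 'c' and every ancestor. */
static void vram_uncharge(struct vram_counter *c, unsigned long pages)
{
	for (; c; c = c->parent) {
		assert(c->usage >= pages);
		c->usage -= pages;
	}
}

/* First eviction pass honours 'low'; a second pass may ignore it. */
enum evict_pass { EVICT_RESPECT_LOW, EVICT_IGNORE_LOW };

/* Decide whether memory charged to 'victim' may be evicted in this pass. */
static bool vram_may_evict(const struct vram_counter *victim,
			   enum evict_pass pass)
{
	if (victim->usage <= victim->min)
		return false;  /* protected by min in every pass */
	if (pass == EVICT_RESPECT_LOW && victim->usage <= victim->low)
		return false;  /* protected by low in the first pass */
	return true;
}
```

In this sketch, an allocator that finds no evictable victim even with
EVICT_IGNORE_LOW would be the case where the caller bails out with
-ENOSPC.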

Cheers,
Maarten

Maarten Lankhorst (6):
  mm/page_counter: Move calculating protection values to page_counter
  drm/cgroup: Add memory accounting DRM cgroup
  drm/ttm: Handle cgroup based eviction in TTM
  drm/xe: Implement cgroup for vram
  drm/amdgpu: Add cgroups implementation
  drm/xe: Hack to test with mapped pages instead of vram.

 Documentation/admin-guide/cgroup-v2.rst       |  51 ++
 Documentation/gpu/drm-compute.rst             |  54 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |   6 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  |   6 +
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       |  18 +-
 drivers/gpu/drm/ttm/tests/ttm_resource_test.c |   2 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  38 +-
 drivers/gpu/drm/ttm/ttm_resource.c            |  28 +-
 drivers/gpu/drm/xe/xe_device.c                |   4 +
 drivers/gpu/drm/xe/xe_device_types.h          |   4 +
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |  14 +
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |  10 +
 include/drm/ttm/ttm_bo.h                      |   3 +-
 include/drm/ttm/ttm_resource.h                |  16 +-
 include/linux/cgroup_drm.h                    | 115 +++
 include/linux/cgroup_subsys.h                 |   4 +
 include/linux/page_counter.h                  |   4 +
 init/Kconfig                                  |   7 +
 kernel/cgroup/Makefile                        |   1 +
 kernel/cgroup/drm.c                           | 813 ++++++++++++++++++
 mm/memcontrol.c                               | 154 +---
 mm/page_counter.c                             | 173 ++++
 23 files changed, 1355 insertions(+), 172 deletions(-)
 create mode 100644 Documentation/gpu/drm-compute.rst
 create mode 100644 include/linux/cgroup_drm.h
 create mode 100644 kernel/cgroup/drm.c