From patchwork Thu Nov 30 20:14:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pasha Tatashin
X-Patchwork-Id: 13475012
From: Pasha Tatashin <pasha.tatashin@soleen.com>
To: akpm@linux-foundation.org, alim.akhtar@samsung.com,
 alyssa@rosenzweig.io, asahi@lists.linux.dev, baolu.lu@linux.intel.com,
 bhelgaas@google.com, cgroups@vger.kernel.org, corbet@lwn.net,
 david@redhat.com, dwmw2@infradead.org, hannes@cmpxchg.org,
 heiko@sntech.de, iommu@lists.linux.dev, jernej.skrabec@gmail.com,
 jonathanh@nvidia.com, joro@8bytes.org, krzysztof.kozlowski@linaro.org,
 linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-rockchip@lists.infradead.org, linux-samsung-soc@vger.kernel.org,
 linux-sunxi@lists.linux.dev, linux-tegra@vger.kernel.org,
 lizefan.x@bytedance.com, marcan@marcan.st, mhiramat@kernel.org,
 m.szyprowski@samsung.com, pasha.tatashin@soleen.com, paulmck@kernel.org,
 rdunlap@infradead.org, robin.murphy@arm.com, samuel@sholland.org,
 suravee.suthikulpanit@amd.com, sven@svenpeter.dev,
 thierry.reding@gmail.com, tj@kernel.org, tomas.mudrunka@gmail.com,
 vdumpa@nvidia.com, wens@csie.org, will@kernel.org, yu-cheng.yu@intel.com
Subject: [PATCH v2 00/10] IOMMU memory observability
Date: Thu, 30 Nov 2023 20:14:54 +0000
Message-ID: <20231130201504.2322355-1-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.43.0.rc2.451.g8631bc7472-goog
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Pasha Tatashin

----------------------------------------------------------------------
Changelog
----------------------------------------------------------------------
v2:
- Added Reviewed-by from Janne Grunau
- Synced with 6.7.0-rc3, 3b47bc037bd44f142ac09848e8d3ecccc726be99
- Separated the following patches from the series:
  vhost-vdpa: account iommu allocations
  https://lore.kernel.org/all/20231130200447.2319543-1-pasha.tatashin@soleen.com
  vfio: account iommu allocations
  https://lore.kernel.org/all/20231130200900.2320829-1-pasha.tatashin@soleen.com
  as suggested by Jason Gunthorpe
- Fixed SPARC build issue detected by kernel test robot
- Dropped the following patches, as they do not account IOMMU page
  tables:
  iommu/dma: use page allocation function provided by iommu-pages.h
  iommu/fsl: use page allocation function provided by iommu-pages.h
  iommu/iommufd: use page allocation function provided by iommu-pages.h
  as suggested by Robin Murphy. These patches are not related to IOMMU
  page tables. Supporting DMA observability might need separate work.
- Removed support for iommu/io-pgtable-arm-v7s, as its 2nd-level tables
  are smaller than a page. Thanks to Robin Murphy for pointing this out.

----------------------------------------------------------------------
Description
----------------------------------------------------------------------
The IOMMU subsystem may hold gigabytes of state. The majority of that
state is IOMMU page tables. Yet there is currently no way to observe
how much memory is actually used by the IOMMU subsystem.

This patch series solves this problem by adding both observability for
all pages that are allocated by IOMMU, and accountability, so that
admins can limit the amount via cgroups.

The system-wide observability uses /proc/meminfo:
SecPageTables:    438176 kB

Contains IOMMU and KVM memory.

Per-node observability:
/sys/devices/system/node/nodeN/meminfo
Node N SecPageTables:    422204 kB

Contains IOMMU and KVM memory in the given NUMA node.

Per-node IOMMU-only observability:
/sys/devices/system/node/nodeN/vmstat
nr_iommu_pages 105555

Contains the number of pages IOMMU allocated in the given node.

Accountability: uses the sec_pagetables entry of cgroup-v2 memory.stat.

With the change, iova_stress[1] stops once the limit is reached:

# ./iova_stress
iova space:     0T      free memory:   497G
iova space:     1T      free memory:   495G
iova space:     2T      free memory:   493G
iova space:     3T      free memory:   491G

stops as limit is reached.

This series incorporates suggestions that came from the discussion
at LPC [2].
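
The counters described above can be read from user space with a few
lines of script. The following is only a minimal sketch and not part
of this series: the helper names parse_meminfo_kb() and parse_counter()
are made up for illustration, node0 is an arbitrary example node, and
nr_iommu_pages is assumed to be present only on a kernel that carries
these patches (the helpers return None when an entry is missing).

```python
import re


def parse_meminfo_kb(text, field):
    """Return the kB value of `field` from meminfo-style text, or None.

    Accepts both the system-wide form ("SecPageTables:  438176 kB") and
    the per-node form ("Node 0 SecPageTables:  422204 kB").
    """
    pattern = re.compile(
        r"^(?:Node\s+\d+\s+)?" + re.escape(field) + r":\s+(\d+)\s+kB\s*$",
        re.MULTILINE,
    )
    match = pattern.search(text)
    return int(match.group(1)) if match else None


def parse_counter(text, name):
    """Return the integer value of a "name value" line, or None.

    This is the layout used by the per-node vmstat file and by the
    cgroup-v2 memory.stat file.
    """
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == name:
            return int(parts[1])
    return None


def read_text(path):
    """Read a file, returning an empty string if it cannot be opened."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    # node0 is only an example; pick the NUMA node of interest.
    node = "/sys/devices/system/node/node0"
    print("system SecPageTables (kB):",
          parse_meminfo_kb(read_text("/proc/meminfo"), "SecPageTables"))
    print("node0 SecPageTables (kB):",
          parse_meminfo_kb(read_text(node + "/meminfo"), "SecPageTables"))
    print("node0 nr_iommu_pages:",
          parse_counter(read_text(node + "/vmstat"), "nr_iommu_pages"))
```

parse_counter() can also be pointed at a cgroup's memory.stat to read
the sec_pagetables entry, since that file uses the same "name value"
layout.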
----------------------------------------------------------------------
[1] https://github.com/soleen/iova_stress
[2] https://lpc.events/event/17/contributions/1466

----------------------------------------------------------------------
Previous versions
v1: https://lore.kernel.org/all/20231128204938.1453583-1-pasha.tatashin@soleen.com
----------------------------------------------------------------------

Pasha Tatashin (10):
  iommu/vt-d: add wrapper functions for page allocations
  iommu/amd: use page allocation function provided by iommu-pages.h
  iommu/io-pgtable-arm: use page allocation function provided by
    iommu-pages.h
  iommu/io-pgtable-dart: use page allocation function provided by
    iommu-pages.h
  iommu/exynos: use page allocation function provided by iommu-pages.h
  iommu/rockchip: use page allocation function provided by iommu-pages.h
  iommu/sun50i: use page allocation function provided by iommu-pages.h
  iommu/tegra-smmu: use page allocation function provided by
    iommu-pages.h
  iommu: observability of the IOMMU allocations
  iommu: account IOMMU allocated memory

 Documentation/admin-guide/cgroup-v2.rst |   2 +-
 Documentation/filesystems/proc.rst      |   4 +-
 drivers/iommu/amd/amd_iommu.h           |   8 -
 drivers/iommu/amd/init.c                |  91 +++++-----
 drivers/iommu/amd/io_pgtable.c          |  13 +-
 drivers/iommu/amd/io_pgtable_v2.c       |  20 +-
 drivers/iommu/amd/iommu.c               |  13 +-
 drivers/iommu/exynos-iommu.c            |  14 +-
 drivers/iommu/intel/dmar.c              |  10 +-
 drivers/iommu/intel/iommu.c             |  47 ++---
 drivers/iommu/intel/iommu.h             |   2 -
 drivers/iommu/intel/irq_remapping.c     |  10 +-
 drivers/iommu/intel/pasid.c             |  12 +-
 drivers/iommu/intel/svm.c               |   7 +-
 drivers/iommu/io-pgtable-arm.c          |   7 +-
 drivers/iommu/io-pgtable-dart.c         |  37 ++--
 drivers/iommu/iommu-pages.h             | 231 ++++++++++++++++++++++++
 drivers/iommu/rockchip-iommu.c          |  14 +-
 drivers/iommu/sun50i-iommu.c            |   7 +-
 drivers/iommu/tegra-smmu.c              |  18 +-
 include/linux/mmzone.h                  |   5 +-
 mm/vmstat.c                             |   3 +
 22 files changed, 390 insertions(+), 185 deletions(-)
 create mode 100644 drivers/iommu/iommu-pages.h