From patchwork Sun Jan 12 15:53:44 2025
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13936452
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jannh@google.com
Subject: [RFC PATCH v4 00/10] AMD broadcast TLB invalidation
Date: Sun, 12 Jan 2025 10:53:44 -0500
Message-ID: <20250112155453.1104139-1-riel@surriel.com>

Add support for broadcast TLB invalidation using AMD's INVLPGB instruction.

This allows the kernel to invalidate TLB entries on remote CPUs without
needing to send IPIs, without having to wait for remote CPUs to handle
those interrupts, and with less interruption to what was running on
those CPUs.

Because x86 PCID space is limited, and there are some very large
systems out there, broadcast TLB invalidation is only used for
processes that are active on 3 or more CPUs, with the threshold
being gradually increased the more the PCID space gets exhausted.

Combined with the removal of unnecessary lru_add_drain calls
(see https://lkml.org/lkml/2024/12/19/1388) this results in a
nice performance boost for the will-it-scale tlb_flush2_threads
test on an AMD Milan system with 36 cores:

- vanilla kernel:           527k loops/second
- lru_add_drain removal:    731k loops/second
- only INVLPGB:             527k loops/second
- lru_add_drain + INVLPGB: 1157k loops/second

Profiling with only the INVLPGB changes showed that while
TLB invalidation went down from 40% of the total CPU
time to only around 4% of CPU time, the contention
simply moved to the LRU lock.

Fixing both at the same time about doubles the
number of iterations per second for this case.

Some numbers closer to real-world performance
can be found at Phoronix, thanks to Michael:

https://www.phoronix.com/news/AMD-INVLPGB-Linux-Benefits

There was a large amount of feedback and debate around v3 of
the series. I have tried to incorporate everybody's feedback,
but please let me know if I missed a spot.

v4:
 - Use only bitmaps to track free global ASIDs (Nadav)
 - Improved AMD initialization (Borislav & Tom)
 - Various naming and documentation improvements (Peter, Nadav, Tom, Dave)
 - Fixes for subtle race conditions (Jann)
v3:
 - Remove paravirt tlb_remove_table call (thank you Qi Zheng)
 - More suggested cleanups and changelog fixes by Peter and Nadav
v2:
 - Apply suggestions by Peter and Borislav (thank you!)
 - Fix bug in arch_tlbbatch_flush, where we need to do both
   the TLBSYNC, and flush the CPUs that are in the cpumask.
 - Some updates to comments and changelogs based on questions.
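
As a quick cross-check of the will-it-scale figures quoted above, the
relative speedups can be worked out with a few lines of arithmetic. This
is only an illustrative sketch and not part of the series: the numbers
are copied from the benchmark list, and the names (results, baseline,
speedups) are made up for the example.

```python
# Throughput in thousands of loops/second, as quoted in the cover letter
# for will-it-scale tlb_flush2_threads on the 36-core AMD Milan system.
results = {
    "vanilla kernel": 527,
    "lru_add_drain removal": 731,
    "only INVLPGB": 527,
    "lru_add_drain + INVLPGB": 1157,
}

# Everything is compared against the unmodified kernel.
baseline = results["vanilla kernel"]

# Speedup factor of each configuration relative to the baseline.
speedups = {name: kloops / baseline for name, kloops in results.items()}

for name, factor in speedups.items():
    print(f"{name:<24} {results[name]:>5}k loops/s  {factor:.2f}x")
```

With these inputs the lru_add_drain removal alone works out to roughly
1.39x, INVLPGB alone to 1.00x, and the two combined to roughly 2.20x,
which matches the "about doubles" statement above.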