From patchwork Mon Nov 11 20:55:02 2024
From: Suren Baghdasaryan
Date: Mon, 11 Nov 2024 12:55:02 -0800
Subject: [PATCH 0/4] move per-vma lock into vm_area_struct
Message-ID: <20241111205506.3404479-1-surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
    mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
    oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
    peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
    minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
    souravpanda@google.com, pasha.tatashin@soleen.com, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to the rather old Broadwell microarchitecture,
and even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].

This patchset moves vm_lock back into vm_area_struct, aligning it at
the cacheline boundary and changing the slab cache to be
cacheline-aligned as well. This causes VMA memory consumption to grow
from 160 (vm_area_struct) + 40 (vm_lock) bytes to 256 bytes (the
slabinfo columns shown below are <objsize> <objperslab>
<pagesperslab>):

slabinfo before:
 ...            : ...
 vma_lock       ...     40  102    1 : ...
 vm_area_struct ...    160   51    2 : ...

slabinfo after moving vm_lock:
 ...            : ...
 vm_area_struct ...    256   32    2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64
pages, which is 5.5MB per 100000 VMAs.
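To make the layout change concrete, here is a minimal sketch. It is
not taken from the patches themselves: all members other than the lock
are elided, vma_cache_init() is a hypothetical helper, and the slab
flags are illustrative.

/*
 * Illustrative sketch only. ____cacheline_aligned_in_smp starts vm_lock
 * on its own cacheline to avoid false sharing with hot neighbors;
 * SLAB_HWCACHE_ALIGN makes every allocated object begin on a cacheline
 * boundary, so the member alignment holds for each VMA in the cache.
 */
#include <linux/cache.h>   /* ____cacheline_aligned_in_smp */
#include <linux/rwsem.h>   /* struct rw_semaphore */
#include <linux/slab.h>    /* kmem_cache_create(), SLAB_* flags */

struct vma_lock {
	struct rw_semaphore lock;	/* 40 bytes until patch 3/4 shrinks it */
};

struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
	/* ... remaining members elided ... */
	struct vma_lock vm_lock ____cacheline_aligned_in_smp;
};

static struct kmem_cache *vm_area_cachep;

static void vma_cache_init(void)
{
	vm_area_cachep = kmem_cache_create("vm_area_struct",
					   sizeof(struct vm_area_struct), 0,
					   SLAB_HWCACHE_ALIGN | SLAB_ACCOUNT,
					   NULL);
}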
To minimize memory overhead, the vm_lock implementation is changed
from an rw_semaphore (40 bytes) to an atomic (8 bytes), and several
vm_area_struct members are moved into the last cacheline, resulting in
a less fragmented structure. A minimal sketch of such an atomic-based
lock is appended after the diffstat below.

struct vm_area_struct {
	union {
		struct {
			long unsigned int vm_start;      /*     0     8 */
			long unsigned int vm_end;        /*     8     8 */
		};                                       /*     0    16 */
		struct callback_head vm_rcu;             /*     0    16 */
	} __attribute__((__aligned__(8)));               /*     0    16 */
	struct mm_struct *         vm_mm;                /*    16     8 */
	pgprot_t                   vm_page_prot;         /*    24     8 */
	union {
		const vm_flags_t   vm_flags;             /*    32     8 */
		vm_flags_t         __vm_flags;           /*    32     8 */
	};                                               /*    32     8 */
	bool                       detached;             /*    40     1 */

	/* XXX 3 bytes hole, try to pack */

	unsigned int               vm_lock_seq;          /*    44     4 */
	struct list_head           anon_vma_chain;       /*    48    16 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct anon_vma *          anon_vma;             /*    64     8 */
	const struct vm_operations_struct  * vm_ops;     /*    72     8 */
	long unsigned int          vm_pgoff;             /*    80     8 */
	struct file *              vm_file;              /*    88     8 */
	void *                     vm_private_data;      /*    96     8 */
	atomic_long_t              swap_readahead_info;  /*   104     8 */
	struct mempolicy *         vm_policy;            /*   112     8 */

	/* XXX 8 bytes hole, try to pack */

	/* --- cacheline 2 boundary (128 bytes) --- */
	struct vma_lock            vm_lock __attribute__((__aligned__(64))); /*   128     4 */

	/* XXX 4 bytes hole, try to pack */

	struct {
		struct rb_node     rb __attribute__((__aligned__(8))); /*   136    24 */
		long unsigned int  rb_subtree_last;      /*   160     8 */
	} __attribute__((__aligned__(8))) shared;        /*   136    32 */
	struct vm_userfaultfd_ctx  vm_userfaultfd_ctx;   /*   168     0 */

	/* size: 192, cachelines: 3, members: 17 */
	/* sum members: 153, holes: 3, sum holes: 15 */
	/* padding: 24 */
	/* forced alignments: 3, forced holes: 2, sum forced holes: 12 */
} __attribute__((__aligned__(64)));

Memory consumption per 1000 VMAs becomes 48 pages, saving 2 pages
compared to the 50 pages in the baseline:

slabinfo after vm_area_struct changes:
 ...            : ...
 vm_area_struct ...    192   42    2 : ...

Performance measurements using the pft test on x86 do not show a
considerable difference; on a Pixel 6 running Android, the patchset
results in a 3-5% improvement in faults per second.

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/

Suren Baghdasaryan (4):
  mm: introduce vma_start_read_locked{_nested} helpers
  mm: move per-vma lock into vm_area_struct
  mm: replace rw_semaphore with atomic_t in vma_lock
  mm: move lesser used vma_area_struct members into the last cacheline

 include/linux/mm.h        | 163 +++++++++++++++++++++++++++++++++++---
 include/linux/mm_types.h  |  59 +++++++++-----
 include/linux/mmap_lock.h |   3 +
 kernel/fork.c             |  50 ++----------
 mm/init-mm.c              |   2 +
 mm/userfaultfd.c          |  14 ++--
 6 files changed, 205 insertions(+), 86 deletions(-)


base-commit: 931086f2a88086319afb57cd3925607e8cda0a9f
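
Appendix: purely as an illustration of the rw_semaphore-to-atomic
conversion mentioned above, here is a minimal sketch. It is not the
code from patch 3/4: the count encoding and helper names are
hypothetical, only the reader side is shown, and writer parking and
wait-queue wakeups are omitted.

#include <linux/atomic.h>

/*
 * Hypothetical encoding: count == 0 means unlocked, count > 0 is the
 * number of readers, count < 0 means a writer holds the lock. A single
 * atomic word stands in for the 40-byte rw_semaphore.
 */
struct vma_lock {
	atomic_t count;
};

static inline bool vma_lock_read_trylock(struct vma_lock *lock)
{
	int cnt = atomic_read(&lock->count);

	while (cnt >= 0) {
		/* On failure, atomic_try_cmpxchg_acquire() refreshes cnt. */
		if (atomic_try_cmpxchg_acquire(&lock->count, &cnt, cnt + 1))
			return true;	/* reader accounted for */
	}
	/* A writer holds the lock; callers fall back to mmap_lock. */
	return false;
}

static inline void vma_lock_read_unlock(struct vma_lock *lock)
{
	/* Release pairs with the acquire above; last reader drops to 0. */
	atomic_dec_return_release(&lock->count);
}

A real implementation additionally needs a writer path that parks
count at a negative value and waits for readers to drain, which is
what keeps the whole lock within a single word.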