From patchwork Thu Aug 10 20:49:44 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13349920
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Yu Zhao, peterx@redhat.com, Mike Kravetz, Yang Shi, Andrew Morton,
 "Kirill A . Shutemov", Hugh Dickins, David Hildenbrand, Ryan Roberts,
 Matthew Wilcox
Subject: [PATCH RFC] mm: Properly document tail pages for compound pages
Date: Thu, 10 Aug 2023 16:49:44 -0400
Message-ID: <20230810204944.53471-1-peterx@redhat.com>

Tail page struct reuse is over-complicated.  Not only because we have
implicit uses of tail page fields (mapcounts, or private for thp swap
support, etc., which we _may_ still use in the page structs, while their
relationship to the folio definitions is not obvious), but also because we
have 32/64 bits layouts for struct page, so it's unclear what we can use
and what we cannot when trying to find a new spot in the folio struct.

We also have tricks like page->mapping, where the field can be reused only
in tail pages 1/2 and in no tail page beyond 2.  It is all mostly hidden,
until someone starts to read into a VM_BUG_ON_PAGE() of
__split_huge_page_tail().

Let's document clearly what we can use and what we can't, with an
explanation for each of them.  Hopefully this will:

  (1) Let any reader know exactly what field is where and for what, and
      the relationships between folio tail pages and struct page
      definitions,

  (2) Make clear, for any potential new field to be added to a large
      folio, which fields one can still reuse (look for the _reserved*
      ones).

This assumes WORD is defined as sizeof(void *) on any arch, just like the
other comment in struct page we already have.

One pitfall is that I need to split part of the tail page 1 definition
differently for 32/64 bits, which introduces some duplication of the
fields.  But hopefully that's worthwhile as it makes everything crystal
clear.  Not to mention that the "pitfall" also brings a benefit: we can
actually define fields in a different order for 32/64 bits when we want.

Signed-off-by: Peter Xu
---
 include/linux/mm_types.h | 76 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 67 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 291c05cacd48..3e40e1b9fec3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -313,41 +313,99 @@ struct folio {
 		};
 		struct page page;
 	};
+	/*
+	 * Some of the tail page fields (out of 8 WORDs for either 32/64
+	 * bits archs) may not be reused by the folio object because
+	 * they've already been used by the page struct:
+	 *
+	 * |-------+---------------|
+	 * | Index | Field         |
+	 * |-------+---------------|
+	 * |     0 | flags         |
+	 * |     1 | compound_head |
+	 * |     2 | N/A [0]       |
+	 * |     3 | mapping [1]   |
+	 * |     4 | N/A [0]       |
+	 * |     5 | private [2]   |
+	 * |     6 | mapcount      |
+	 * |     7 | N/A [0]       |
+	 * |-------+---------------|
+	 *
+	 * [0] "N/A" marks fields that are available to leverage for the
+	 *     large folio.
+	 *
+	 * [1] "mapping" field is only used for sanity check, see
+	 *     TAIL_MAPPING.  Still valid to use for tail pages 1/2.
+	 *     (for that, see __split_huge_page_tail()).
+	 *
+	 * [2] "private" field is used when THP_SWAP is on (disabled on 32
+	 *     bits, or on hugetlb folios).
+	 */
 	union {
 		struct {
+			/* WORD 0-1: not valid to reuse */
 			unsigned long _flags_1;
 			unsigned long _head_1;
-	/* public: */
+			/* WORD 2 */
 			unsigned char _folio_dtor;
 			unsigned char _folio_order;
+			unsigned char _holes_1[2];
+#ifdef CONFIG_64BIT
 			atomic_t _entire_mapcount;
+			/* WORD 3 */
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
-#ifdef CONFIG_64BIT
+			/* WORD 4 */
 			unsigned int _folio_nr_pages;
+			unsigned int _reserved_1_1;
+			/* WORD 5-6: not valid to reuse */
+			unsigned long _used_1_2[2];
+			/* WORD 7 */
+			unsigned long _reserved_1_2;
+#else
+			/* WORD 3 */
+			atomic_t _entire_mapcount;
+			/* WORD 4 */
+			atomic_t _nr_pages_mapped;
+			/* WORD 5: only valid for 32bits */
+			atomic_t _pincount;
+			/* WORD 6: not valid to reuse */
+			unsigned long _used_1_2;
+			/* WORD 7 */
+			unsigned long _reserved_1;
 #endif
-	/* private: the union with struct page is transitional */
 		};
+	/* private: the union with struct page is transitional */
 		struct page __page_1;
 	};
 	union {
 		struct {
+			/* WORD 0-1: not valid to reuse */
 			unsigned long _flags_2;
 			unsigned long _head_2;
-	/* public: */
+			/* WORD 2-5 */
 			void *_hugetlb_subpool;
 			void *_hugetlb_cgroup;
 			void *_hugetlb_cgroup_rsvd;
 			void *_hugetlb_hwpoison;
-	/* private: the union with struct page is transitional */
+			/* WORD 6: not valid to reuse */
+			unsigned long _used_2_2;
+			/* WORD 7: */
+			unsigned long _reserved_2_1;
 		};
 		struct {
-			unsigned long _flags_2a;
-			unsigned long _head_2a;
-	/* public: */
+			/* WORD 0-1: not valid to reuse */
+			unsigned long _used_2_3[2];
+			/* WORD 2-3: */
 			struct list_head _deferred_list;
-	/* private: the union with struct page is transitional */
+			/* WORD 4: */
+			unsigned long _reserved_2_2;
+			/* WORD 5-6: not valid to reuse */
+			unsigned long _used_2_4[2];
+			/* WORD 7: */
+			unsigned long _reserved_2_3;
 		};
+	/* private: the union with struct page is transitional */
 		struct page __page_2;
 	};
 };
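
The WORD numbering used in the comments above can be sanity-checked outside
the kernel. What follows is only a rough userspace sketch, not part of the
patch: mock_atomic_t, struct mock_tail1, MOCK_64BIT, WORD_OF() and the two
helper functions are hypothetical names made up for illustration, and the
mock assumes sizeof(unsigned long) == sizeof(void *), as the kernel does. It
mirrors the first tail page fields from the diff and lets the compiler
confirm which WORD each field lands in.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: userspace stand-in for the kernel's atomic_t (a wrapped int). */
typedef struct { int counter; } mock_atomic_t;

/* Stand-in for CONFIG_64BIT, derived from the pointer width. */
#if UINTPTR_MAX > 0xffffffffu
#define MOCK_64BIT 1
#else
#define MOCK_64BIT 0
#endif

/* WORD as the commit message assumes it: one pointer-sized slot. */
#define WORD sizeof(void *)
#define WORD_OF(field) (offsetof(struct mock_tail1, field) / WORD)

/* Hypothetical mock of tail page 1, fields copied from the diff. */
struct mock_tail1 {
	unsigned long _flags_1;		/* WORD 0 */
	unsigned long _head_1;		/* WORD 1 */
	unsigned char _folio_dtor;	/* WORD 2 */
	unsigned char _folio_order;
	unsigned char _holes_1[2];
#if MOCK_64BIT
	mock_atomic_t _entire_mapcount;
	mock_atomic_t _nr_pages_mapped;	/* WORD 3 */
	mock_atomic_t _pincount;
	unsigned int _folio_nr_pages;	/* WORD 4 */
	unsigned int _reserved_1_1;
	unsigned long _used_1_2[2];	/* WORD 5-6 */
	unsigned long _reserved_1_2;	/* WORD 7 */
#else
	mock_atomic_t _entire_mapcount;	/* WORD 3 */
	mock_atomic_t _nr_pages_mapped;	/* WORD 4 */
	mock_atomic_t _pincount;	/* WORD 5 */
	unsigned long _used_1_2;	/* WORD 6 */
	unsigned long _reserved_1;	/* WORD 7 */
#endif
};

_Static_assert(sizeof(unsigned long) == sizeof(void *),
	       "mock assumes long is pointer-sized");
_Static_assert(sizeof(struct mock_tail1) == 8 * WORD,
	       "tail page 1 should span exactly 8 WORDs");
_Static_assert(WORD_OF(_head_1) == 1, "compound_head slot is WORD 1");
_Static_assert(WORD_OF(_folio_dtor) == 2, "first reusable slot is WORD 2");
_Static_assert(WORD_OF(_entire_mapcount) == (MOCK_64BIT ? 2 : 3),
	       "_entire_mapcount shares WORD 2 only on 64 bits");
_Static_assert(WORD_OF(_used_1_2) == (MOCK_64BIT ? 5 : 6),
	       "first non-reusable slot after the reusable area");
#if MOCK_64BIT
_Static_assert(WORD_OF(_reserved_1_2) == 7, "last WORD is reserved");
#else
_Static_assert(WORD_OF(_reserved_1) == 7, "last WORD is reserved");
#endif

/* Runtime helpers, so a caller can print or test the same facts. */
size_t tail1_word_of_pincount(void)
{
	return WORD_OF(_pincount);
}

size_t tail1_size_in_words(void)
{
	return sizeof(struct mock_tail1) / WORD;
}
```

If the mock compiles, every static assertion held for the build's pointer
width. In this sketch a new field added without shrinking a _reserved* slot
would trip the 8-WORD size assertion.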