From patchwork Fri Jun 7 09:09:36 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-hyperv@vger.kernel.org,
 virtualization@lists.linux.dev, xen-devel@lists.xenproject.org,
 kasan-dev@googlegroups.com, David Hildenbrand, Andrew Morton,
 Mike Rapoport, Oscar Salvador, "K. Y. Srinivasan", Haiyang Zhang,
 Wei Liu, Dexuan Cui, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo,
 Eugenio Pérez, Juergen Gross, Stefano Stabellini,
 Oleksandr Tyshchenko, Alexander Potapenko, Marco Elver, Dmitry Vyukov
Subject: [PATCH v1 1/3] mm: pass meminit_context to __free_pages_core()
Date: Fri, 7 Jun 2024 11:09:36 +0200
Message-ID: <20240607090939.89524-2-david@redhat.com>
In-Reply-To: <20240607090939.89524-1-david@redhat.com>
References: <20240607090939.89524-1-david@redhat.com>

In preparation for further changes, let's teach __free_pages_core()
about the differences in memory hotplug handling.

Move the memory hotplug specific handling from generic_online_page() to
__free_pages_core(), use adjust_managed_page_count() on the memory
hotplug path, and spell out why memory freed via memblock cannot
currently use adjust_managed_page_count().
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/internal.h       |  3 ++-
 mm/kmsan/init.c     |  2 +-
 mm/memory_hotplug.c |  9 +--------
 mm/mm_init.c        |  4 ++--
 mm/page_alloc.c     | 17 +++++++++++++++--
 5 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 12e95fdf61e90..3fdee779205ab 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -604,7 +604,8 @@ extern void __putback_isolated_page(struct page *page, unsigned int order,
                                         int mt);
 extern void memblock_free_pages(struct page *page, unsigned long pfn,
                                         unsigned int order);
-extern void __free_pages_core(struct page *page, unsigned int order);
+extern void __free_pages_core(struct page *page, unsigned int order,
+               enum meminit_context);
 
 /*
  * This will have no effect, other than possibly generating a warning, if the
diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
index 3ac3b8921d36f..ca79636f858e5 100644
--- a/mm/kmsan/init.c
+++ b/mm/kmsan/init.c
@@ -172,7 +172,7 @@ static void do_collection(void)
                 shadow = smallstack_pop(&collect);
                 origin = smallstack_pop(&collect);
                 kmsan_setup_meta(page, shadow, origin, collect.order);
-                __free_pages_core(page, collect.order);
+                __free_pages_core(page, collect.order, MEMINIT_EARLY);
         }
 }
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 171ad975c7cfd..27e3be75edcf7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -630,14 +630,7 @@ EXPORT_SYMBOL_GPL(restore_online_page_callback);
 
 void generic_online_page(struct page *page, unsigned int order)
 {
-        /*
-         * Freeing the page with debug_pagealloc enabled will try to unmap it,
-         * so we should map it first. This is better than introducing a special
-         * case in page freeing fast path.
-         */
-        debug_pagealloc_map_pages(page, 1 << order);
-        __free_pages_core(page, order);
-        totalram_pages_add(1UL << order);
+        __free_pages_core(page, order, MEMINIT_HOTPLUG);
 }
 EXPORT_SYMBOL_GPL(generic_online_page);
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 019193b0d8703..feb5b6e8c8875 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1938,7 +1938,7 @@ static void __init deferred_free_range(unsigned long pfn,
         for (i = 0; i < nr_pages; i++, page++, pfn++) {
                 if (pageblock_aligned(pfn))
                         set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-                __free_pages_core(page, 0);
+                __free_pages_core(page, 0, MEMINIT_EARLY);
         }
 }
 
@@ -2513,7 +2513,7 @@ void __init memblock_free_pages(struct page *page, unsigned long pfn,
                 }
         }
 
-        __free_pages_core(page, order);
+        __free_pages_core(page, order, MEMINIT_EARLY);
 }
 
 DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2224965ada468..e0c8a8354be36 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1214,7 +1214,8 @@ static void __free_pages_ok(struct page *page, unsigned int order,
         __count_vm_events(PGFREE, 1 << order);
 }
 
-void __free_pages_core(struct page *page, unsigned int order)
+void __free_pages_core(struct page *page, unsigned int order,
+               enum meminit_context context)
 {
         unsigned int nr_pages = 1 << order;
         struct page *p = page;
@@ -1234,7 +1235,19 @@ void __free_pages_core(struct page *page, unsigned int order)
         __ClearPageReserved(p);
         set_page_count(p, 0);
 
-        atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
+        if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
+            unlikely(context == MEMINIT_HOTPLUG)) {
+                /*
+                 * Freeing the page with debug_pagealloc enabled will try to
+                 * unmap it; some archs don't like double-unmappings, so
+                 * map it first.
+                 */
+                debug_pagealloc_map_pages(page, nr_pages);
+                adjust_managed_page_count(page, nr_pages);
+        } else {
+                /* memblock adjusts totalram_pages() ahead of time. */
+                atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
+        }
 
         if (page_contains_unaccepted(page, order)) {
                 if (order == MAX_PAGE_ORDER && __free_unaccepted(page))
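For reference, after this patch __free_pages_core() behaves roughly as in
the following condensed sketch (not the verbatim mm/page_alloc.c code; the
refcount reset and the unaccepted-memory/buddy handoff are elided):

void __free_pages_core(struct page *page, unsigned int order,
                       enum meminit_context context)
{
        unsigned int nr_pages = 1 << order;

        /* ... clear PG_reserved and zero the refcount of each page ... */

        if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
            unlikely(context == MEMINIT_HOTPLUG)) {
                /*
                 * Memory hotplug path: map the pages for debug_pagealloc
                 * and account them via adjust_managed_page_count(), which
                 * updates both the zone's managed_pages and totalram_pages.
                 */
                debug_pagealloc_map_pages(page, nr_pages);
                adjust_managed_page_count(page, nr_pages);
        } else {
                /*
                 * Early boot path: memblock already adjusted
                 * totalram_pages, so only the zone counter is updated.
                 */
                atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
        }

        /* ... hand the pages to the buddy allocator ... */
}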
From patchwork Fri Jun 7 09:09:37 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-hyperv@vger.kernel.org,
 virtualization@lists.linux.dev, xen-devel@lists.xenproject.org,
 kasan-dev@googlegroups.com, David Hildenbrand, Andrew Morton,
 Mike Rapoport, Oscar Salvador, "K. Y. Srinivasan", Haiyang Zhang,
 Wei Liu, Dexuan Cui, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo,
 Eugenio Pérez, Juergen Gross, Stefano Stabellini,
 Oleksandr Tyshchenko, Alexander Potapenko, Marco Elver, Dmitry Vyukov
Subject: [PATCH v1 2/3] mm/memory_hotplug: initialize memmap of !ZONE_DEVICE
 with PageOffline() instead of PageReserved()
Date: Fri, 7 Jun 2024 11:09:37 +0200
Message-ID: <20240607090939.89524-3-david@redhat.com>
In-Reply-To: <20240607090939.89524-1-david@redhat.com>
References: <20240607090939.89524-1-david@redhat.com>
We currently initialize the memmap such that PG_reserved is set and the
refcount of the page is 1. In virtio-mem code, we have to manually clear
that PG_reserved flag to make memory offlining with partially hotplugged
memory blocks possible: has_unmovable_pages() would otherwise bail out on
such pages.

We want to avoid PG_reserved where possible and move to typed pages
instead. Further, we want to enlighten memory offlining code about
PG_offline: offline pages in an online memory section. One example is
handling managed page count adjustments in a cleaner way during memory
offlining.

So let's initialize the pages with PG_offline instead of PG_reserved.
generic_online_page()->__free_pages_core() will now clear that flag before
handing that memory to the buddy.

Note that the page refcount is still 1 and would forbid offlining of such
memory except when special care is taken during GOING_OFFLINE, as
currently only implemented by virtio-mem.

With this change, we can now get non-PageReserved() pages in the XEN
balloon list. From what I can tell, that can already happen via
decrease_reservation(), so that should be fine.

HV-balloon should not really observe a change: partially onlined memory
blocks still cannot get surprise-offlined, because the refcount of these
PageOffline() pages is 1.

Update virtio-mem, HV-balloon and XEN-balloon code to be aware that
hotplugged pages are now PageOffline() instead of PageReserved() before
they are handed over to the buddy.

We'll leave the ZONE_DEVICE case alone for now.

Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador # for the generic
---
 drivers/hv/hv_balloon.c     |  5 ++---
 drivers/virtio/virtio_mem.c | 18 ++++++++++++------
 drivers/xen/balloon.c       |  9 +++++++--
 include/linux/page-flags.h  | 12 +++++-------
 mm/memory_hotplug.c         | 16 ++++++++++------
 mm/mm_init.c                | 10 ++++++++--
 mm/page_alloc.c             | 32 +++++++++++++++++++++++---------
 7 files changed, 67 insertions(+), 35 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index e000fa3b9f978..c1be38edd8361 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -693,9 +693,8 @@ static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
                 if (!PageOffline(pg))
                         __SetPageOffline(pg);
                 return;
-        }
-        if (PageOffline(pg))
-                __ClearPageOffline(pg);
+        } else if (!PageOffline(pg))
+                return;
 
         /* This frame is currently backed; online the page. */
         generic_online_page(pg, 0);
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index a3857bacc8446..b90df29621c81 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1146,12 +1146,16 @@ static void virtio_mem_set_fake_offline(unsigned long pfn,
         for (; nr_pages--; pfn++) {
                 struct page *page = pfn_to_page(pfn);
 
-                __SetPageOffline(page);
-                if (!onlined) {
+                if (!onlined)
+                        /*
+                         * Pages that have not been onlined yet were initialized
+                         * to PageOffline(). Remember that we have to route them
+                         * through generic_online_page().
+                         */
                         SetPageDirty(page);
-                        /* FIXME: remove after cleanups */
-                        ClearPageReserved(page);
-                }
+                else
+                        __SetPageOffline(page);
+                VM_WARN_ON_ONCE(!PageOffline(page));
         }
         page_offline_end();
 }
@@ -1166,9 +1170,11 @@ static void virtio_mem_clear_fake_offline(unsigned long pfn,
         for (; nr_pages--; pfn++) {
                 struct page *page = pfn_to_page(pfn);
 
-                __ClearPageOffline(page);
                 if (!onlined)
+                        /* generic_online_page() will clear PageOffline(). */
                         ClearPageDirty(page);
+                else
+                        __ClearPageOffline(page);
         }
 }
 
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index aaf2514fcfa46..528395133b4f8 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -146,7 +146,8 @@ static DECLARE_WAIT_QUEUE_HEAD(balloon_wq);
 /* balloon_append: add the given page to the balloon. */
 static void balloon_append(struct page *page)
 {
-        __SetPageOffline(page);
+        if (!PageOffline(page))
+                __SetPageOffline(page);
 
         /* Lowmem is re-populated first, so highmem pages go at list tail. */
         if (PageHighMem(page)) {
@@ -412,7 +413,11 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 
                 xenmem_reservation_va_mapping_update(1, &page, &frame_list[i]);
 
-                /* Relinquish the page back to the allocator. */
+                /*
+                 * Relinquish the page back to the allocator. Note that
+                 * some pages, including ones added via xen_online_page(), might
+                 * not be marked reserved; free_reserved_page() will handle that.
+                 */
                 free_reserved_page(page);
         }
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f04fea86324d9..e0362ce7fc109 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -30,16 +30,11 @@
  *   - Pages falling into physical memory gaps - not IORESOURCE_SYSRAM. Trying
  *     to read/write these pages might end badly. Don't touch!
  *   - The zero page(s)
- *   - Pages not added to the page allocator when onlining a section because
- *     they were excluded via the online_page_callback() or because they are
- *     PG_hwpoison.
  *   - Pages allocated in the context of kexec/kdump (loaded kernel image,
  *     control pages, vmcoreinfo)
  *   - MMIO/DMA pages. Some architectures don't allow to ioremap pages that are
  *     not marked PG_reserved (as they might be in use by somebody else who does
  *     not respect the caching strategy).
- *   - Pages part of an offline section (struct pages of offline sections should
- *     not be trusted as they will be initialized when first onlined).
  *   - MCA pages on ia64
  *   - Pages holding CPU notes for POWER Firmware Assisted Dump
  *   - Device memory (e.g. PMEM, DAX, HMM)
@@ -1021,6 +1016,10 @@ PAGE_TYPE_OPS(Buddy, buddy, buddy)
  * The content of these pages is effectively stale. Such pages should not
  * be touched (read/write/dump/save) except by their owner.
  *
+ * When a memory block gets onlined, all pages are initialized with a
+ * refcount of 1 and PageOffline(). generic_online_page() will
+ * take care of clearing PageOffline().
+ *
  * If a driver wants to allow to offline unmovable PageOffline() pages without
  * putting them back to the buddy, it can do so via the memory notifier by
  * decrementing the reference count in MEM_GOING_OFFLINE and incrementing the
@@ -1028,8 +1027,7 @@ PAGE_TYPE_OPS(Buddy, buddy, buddy)
  * pages (now with a reference count of zero) are treated like free pages,
  * allowing the containing memory block to get offlined. A driver that
  * relies on this feature is aware that re-onlining the memory block will
- * require to re-set the pages PageOffline() and not giving them to the
- * buddy via online_page_callback_t.
+ * require not giving them to the buddy via generic_online_page().
 *
 * There are drivers that mark a page PageOffline() and expect there won't be
 * any further access to page content. PFN walkers that read content of random
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 27e3be75edcf7..0254059efcbe1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -734,7 +734,7 @@ static inline void section_taint_zone_device(unsigned long pfn)
 /*
  * Associate the pfn range with the given zone, initializing the memmaps
  * and resizing the pgdat/zone data to span the added pages. After this
- * call, all affected pages are PG_reserved.
+ * call, all affected pages are PageOffline().
  *
  * All aligned pageblocks are initialized to the specified migratetype
  * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
@@ -1100,8 +1100,12 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 
         move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
 
-        for (i = 0; i < nr_pages; i++)
-                SetPageVmemmapSelfHosted(pfn_to_page(pfn + i));
+        for (i = 0; i < nr_pages; i++) {
+                struct page *page = pfn_to_page(pfn + i);
+
+                __ClearPageOffline(page);
+                SetPageVmemmapSelfHosted(page);
+        }
 
         /*
          * It might be that the vmemmap_pages fully span sections. If that is
@@ -1959,9 +1963,9 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
          * Don't allow to offline memory blocks that contain holes.
          * Consequently, memory blocks with holes can never get onlined
          * via the hotplug path - online_pages() - as hotplugged memory has
-         * no holes. This way, we e.g., don't have to worry about marking
-         * memory holes PG_reserved, don't need pfn_valid() checks, and can
-         * avoid using walk_system_ram_range() later.
+         * no holes. This way, we don't have to worry about memory holes,
+         * don't need pfn_valid() checks, and can avoid using
+         * walk_system_ram_range() later.
          */
         walk_system_ram_range(start_pfn, nr_pages, &system_ram_pages,
                               count_system_ram_pages_cb);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index feb5b6e8c8875..c066c1c474837 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -892,8 +892,14 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
 
                 page = pfn_to_page(pfn);
                 __init_single_page(page, pfn, zone, nid);
-                if (context == MEMINIT_HOTPLUG)
-                        __SetPageReserved(page);
+                if (context == MEMINIT_HOTPLUG) {
+#ifdef CONFIG_ZONE_DEVICE
+                        if (zone == ZONE_DEVICE)
+                                __SetPageReserved(page);
+                        else
+#endif
+                                __SetPageOffline(page);
+                }
 
                 /*
                  * Usually, we want to mark the pageblock MIGRATE_MOVABLE,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e0c8a8354be36..039bc52cc9091 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1225,18 +1225,23 @@ void __free_pages_core(struct page *page, unsigned int order,
          * When initializing the memmap, __init_single_page() sets the refcount
          * of all pages to 1 ("allocated"/"not free"). We have to set the
          * refcount of all involved pages to 0.
+         *
+         * Note that hotplugged memory pages are initialized to PageOffline().
+         * Pages freed from memblock might be marked as reserved.
          */
-        prefetchw(p);
-        for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
-                prefetchw(p + 1);
-                __ClearPageReserved(p);
-                set_page_count(p, 0);
-        }
-        __ClearPageReserved(p);
-        set_page_count(p, 0);
-
         if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG) &&
             unlikely(context == MEMINIT_HOTPLUG)) {
+                prefetchw(p);
+                for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
+                        prefetchw(p + 1);
+                        VM_WARN_ON_ONCE(PageReserved(p));
+                        __ClearPageOffline(p);
+                        set_page_count(p, 0);
+                }
+                VM_WARN_ON_ONCE(PageReserved(p));
+                __ClearPageOffline(p);
+                set_page_count(p, 0);
+
                 /*
                  * Freeing the page with debug_pagealloc enabled will try to
                  * unmap it; some archs don't like double-unmappings, so
@@ -1245,6 +1250,15 @@ void __free_pages_core(struct page *page, unsigned int order,
                 debug_pagealloc_map_pages(page, nr_pages);
                 adjust_managed_page_count(page, nr_pages);
         } else {
+                prefetchw(p);
+                for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
+                        prefetchw(p + 1);
+                        __ClearPageReserved(p);
+                        set_page_count(p, 0);
+                }
+                __ClearPageReserved(p);
+                set_page_count(p, 0);
+
                 /* memblock adjusts totalram_pages() ahead of time. */
                 atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
         }
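The memmap initialization policy that results from this patch can be
condensed into a small sketch (a hypothetical helper for illustration
only; the kernel open-codes this logic in the memmap_init_range() hunk
above):

/*
 * Hypothetical helper condensing the mm/mm_init.c hunk above: during
 * hotplug, only ZONE_DEVICE pages keep PG_reserved. All other hotplugged
 * pages now start out PageOffline() with a refcount of 1, and
 * __free_pages_core() clears PageOffline() again when the pages are
 * handed to the buddy via generic_online_page().
 */
static void init_hotplugged_page(struct page *page, unsigned long zone_idx)
{
#ifdef CONFIG_ZONE_DEVICE
        if (zone_idx == ZONE_DEVICE) {
                __SetPageReserved(page);
                return;
        }
#endif
        __SetPageOffline(page);
}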
From patchwork Fri Jun 7 09:09:38 2024
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-hyperv@vger.kernel.org,
 virtualization@lists.linux.dev, xen-devel@lists.xenproject.org,
 kasan-dev@googlegroups.com, David Hildenbrand, Andrew Morton,
 Mike Rapoport, Oscar Salvador, "K. Y. Srinivasan", Haiyang Zhang,
 Wei Liu, Dexuan Cui, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo,
 Eugenio Pérez, Juergen Gross, Stefano Stabellini,
 Oleksandr Tyshchenko, Alexander Potapenko, Marco Elver, Dmitry Vyukov
Tsirkin" , Jason Wang , Xuan Zhuo , =?utf-8?q?Eugenio_P=C3=A9rez?= , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko , Alexander Potapenko , Marco Elver , Dmitry Vyukov Subject: [PATCH v1 3/3] mm/memory_hotplug: skip adjust_managed_page_count() for PageOffline() pages when offlining Date: Fri, 7 Jun 2024 11:09:38 +0200 Message-ID: <20240607090939.89524-4-david@redhat.com> In-Reply-To: <20240607090939.89524-1-david@redhat.com> References: <20240607090939.89524-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 X-Stat-Signature: hg4tdid99emcs7dsiwe1yfhqjbfo5bc5 X-Rspamd-Queue-Id: E4BCD40016 X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1717751405-259566 X-HE-Meta: U2FsdGVkX1/f/Z/tNVCLpXpq26ArZbb6l9KTljCN3LvToDhRZp/lKYiKzNyVqhz3+h8fKC/utMQ9p/L3X/3GcMTxtkzAeG7+e/cT9FQBVIk0zp8pElCDhr/tjtlGX+IVSux8821ydrNt7Nm4FjUGidWLsbAKBOaphS3WWp3/qtmn7jAMBJ1OkhwWQe9Dg2F1Q8tkwuFsyuUw6nfY1hAPbA6aWUF8bD9gNdgwFsaC/GtNAtr17ISljHvEs/ic64czUC6G0SuI57Absn6Cvwi8i9ioczfRqwOckCgkfpLpnZWAlYDPywYA6r1EeY+cNcI1jwRb91oMv9IidIGp2CBtuCxzFWjKcG+RuoMPgVqXYvk8gMMO7JN9L6U33yg6WkXHNQK1LqfIAaabNLjDOfNC+27LpluJELTahyjsBU4QJPV/yTSuViAavToxhIjdwkfkneVyfBFVmoGswkjS9S4B3dgZ7Wt1R7QK/unPSVK+bTHxU2xhMufB5zqOWRLgP7EhgDA/IjYtgkQqK6qRxR04HVH21PsAKd9rRImriOkDvyRlKW8FQJSY7uuKcddlIl2zS/G8nRKUOSTews5C3CJ77GZOYNV5VRWv8ExnSoAIGxGx6SIuzf2oDpPmvQhYaqLWP9trWpp9iq2t8NcC6iqF2Rl9R2docZCnLaFgy0C8EzKOwDavCVNBD0Mht7kb5yQbaZoMB9tw6sgV/fTT9e5CFlf33bbhWPGaZsm836od3rLyRoK6uHqvJexvAETr4J+Pc8RZQ+BFN0AN3Jwkg7YCxhzBqW+1nclFZLoUhtK6jlwWP6IPwN18MQOZ6MJ2HPamdybTDRtpDjPB9J2lws5eDraPZSuXFS/JOT+4U4GYmwK3u0H15HFxTNWZa5SdB73GjtABdP5epgw5Pwb+fqDPQ58KyQ+NHErDrqCEbaSSE/W8Mldcf4ssY4UN3m8eeM7+69sg0p1LZ4PnVP55TLk XmSaGp3B AX1Zer7Y6W81rMwl8GStCKx3++85xyo84RPKS38knbRG3g/ozQurArQ976O4zSpcY/6BA5sgOwkfoXjkVGNySCA2RlpN9RTv/KVvHHfqW6wqzKA8EdtNvmPdenz0JYn+eyJFEOcukKdk7hvWYnELrIb/uBqz8yTzch93/PkmAvwBlTyr6ce8P2i10rSrRZfZtsyqgMLtAJDcYDBnKzL7omBvikn47Q+5QKShq X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: We currently have a hack for virtio-mem in place to handle memory offlining with PageOffline pages for which we already adjusted the managed page count. Let's enlighten memory offlining code so we can get rid of that hack, and document the situation. Signed-off-by: David Hildenbrand Acked-by: Oscar Salvador --- drivers/virtio/virtio_mem.c | 11 ++--------- include/linux/memory_hotplug.h | 4 ++-- include/linux/page-flags.h | 8 ++++++-- mm/memory_hotplug.c | 6 +++--- mm/page_alloc.c | 12 ++++++++++-- 5 files changed, 23 insertions(+), 18 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index b90df29621c81..b0b8714415783 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -1269,12 +1269,6 @@ static void virtio_mem_fake_offline_going_offline(unsigned long pfn, struct page *page; unsigned long i; - /* - * Drop our reference to the pages so the memory can get offlined - * and add the unplugged pages to the managed page counters (so - * offlining code can correctly subtract them again). - */ - adjust_managed_page_count(pfn_to_page(pfn), nr_pages); /* Drop our reference to the pages so the memory can get offlined. 
         for (i = 0; i < nr_pages; i++) {
                 page = pfn_to_page(pfn + i);
@@ -1293,10 +1287,9 @@ static void virtio_mem_fake_offline_cancel_offline(unsigned long pfn,
         unsigned long i;
 
         /*
-         * Get the reference we dropped when going offline and subtract the
-         * unplugged pages from the managed page counters.
+         * Get the reference again that we dropped via page_ref_dec_and_test()
+         * when going offline.
          */
-        adjust_managed_page_count(pfn_to_page(pfn), -nr_pages);
         for (i = 0; i < nr_pages; i++)
                 page_ref_inc(pfn_to_page(pfn + i));
 }
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7a9ff464608d7..ebe876930e782 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -175,8 +175,8 @@ extern int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 extern void mhp_deinit_memmap_on_memory(unsigned long pfn, unsigned long nr_pages);
 extern int online_pages(unsigned long pfn, unsigned long nr_pages,
                         struct zone *zone, struct memory_group *group);
-extern void __offline_isolated_pages(unsigned long start_pfn,
-                                     unsigned long end_pfn);
+extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
+                                              unsigned long end_pfn);
 
 typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e0362ce7fc109..0876aca0833e7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -1024,11 +1024,15 @@ PAGE_TYPE_OPS(Buddy, buddy, buddy)
  * putting them back to the buddy, it can do so via the memory notifier by
  * decrementing the reference count in MEM_GOING_OFFLINE and incrementing the
  * reference count in MEM_CANCEL_OFFLINE. When offlining, the PageOffline()
- * pages (now with a reference count of zero) are treated like free pages,
- * allowing the containing memory block to get offlined. A driver that
+ * pages (now with a reference count of zero) are treated like free (unmanaged)
+ * pages, allowing the containing memory block to get offlined. A driver that
  * relies on this feature is aware that re-onlining the memory block will
  * require not giving them to the buddy via generic_online_page().
  *
+ * Memory offlining code will not adjust the managed page count for any
+ * PageOffline() pages, treating them like they were never exposed to the
+ * buddy using generic_online_page().
+ *
  * There are drivers that mark a page PageOffline() and expect there won't be
  * any further access to page content. PFN walkers that read content of random
  * pages should check PageOffline() and synchronize with such drivers using
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0254059efcbe1..965707a02556f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1941,7 +1941,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
                         struct zone *zone, struct memory_group *group)
 {
         const unsigned long end_pfn = start_pfn + nr_pages;
-        unsigned long pfn, system_ram_pages = 0;
+        unsigned long pfn, managed_pages, system_ram_pages = 0;
         const int node = zone_to_nid(zone);
         unsigned long flags;
         struct memory_notify arg;
@@ -2062,7 +2062,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
         } while (ret);
 
         /* Mark all sections offline and remove free pages from the buddy. */
-        __offline_isolated_pages(start_pfn, end_pfn);
+        managed_pages = __offline_isolated_pages(start_pfn, end_pfn);
         pr_debug("Offlined Pages %ld\n", nr_pages);
 
         /*
@@ -2078,7 +2078,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
         zone_pcp_enable(zone);
 
         /* removal success */
-        adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+        adjust_managed_page_count(pfn_to_page(start_pfn), -managed_pages);
         adjust_present_page_count(pfn_to_page(start_pfn), group, -nr_pages);
 
         /* reinitialise watermarks and update pcp limits */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 039bc52cc9091..809bc4a816e85 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6745,14 +6745,19 @@ void zone_pcp_reset(struct zone *zone)
 /*
  * All pages in the range must be in a single zone, must not contain holes,
  * must span full sections, and must be isolated before calling this function.
+ *
+ * Returns the number of managed (non-PageOffline()) pages in the range: the
+ * number of pages for which memory offlining code must adjust managed page
+ * counters using adjust_managed_page_count().
  */
-void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
+unsigned long __offline_isolated_pages(unsigned long start_pfn,
+                unsigned long end_pfn)
 {
+        unsigned long already_offline = 0, flags;
         unsigned long pfn = start_pfn;
         struct page *page;
         struct zone *zone;
         unsigned int order;
-        unsigned long flags;
 
         offline_mem_sections(pfn, end_pfn);
         zone = page_zone(pfn_to_page(pfn));
@@ -6774,6 +6779,7 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
                 if (PageOffline(page)) {
                         BUG_ON(page_count(page));
                         BUG_ON(PageBuddy(page));
+                        already_offline++;
                         pfn++;
                         continue;
                 }
@@ -6786,6 +6792,8 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
                 pfn += (1 << order);
         }
         spin_unlock_irqrestore(&zone->lock, flags);
+
+        return end_pfn - start_pfn - already_offline;
 }
 #endif
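To illustrate the driver-facing contract that the page-flags.h
documentation above describes, here is a hypothetical memory-notifier
sketch (not part of this series; virtio-mem implements the real version
of this dance for the pages it fake-offlined):

#include <linux/memory.h>
#include <linux/mm.h>
#include <linux/page_ref.h>

/*
 * Sketch only: drop the reference of PageOffline() pages the driver owns
 * in MEM_GOING_OFFLINE so they are treated like free (unmanaged) pages
 * and the containing block can get offlined; re-take the reference on
 * cancel. The managed page count is left alone; offlining code now skips
 * PageOffline() pages when adjusting it.
 */
static int my_drv_memory_notifier(struct notifier_block *nb,
                                  unsigned long action, void *arg)
{
        struct memory_notify *mhp = arg;
        unsigned long pfn;

        for (pfn = mhp->start_pfn; pfn < mhp->start_pfn + mhp->nr_pages; pfn++) {
                struct page *page = pfn_to_page(pfn);

                /* Only touch pages this driver marked PageOffline(). */
                if (!PageOffline(page))
                        continue;
                if (action == MEM_GOING_OFFLINE)
                        page_ref_dec(page);
                else if (action == MEM_CANCEL_OFFLINE)
                        page_ref_inc(page);
        }
        return NOTIFY_OK;
}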