From patchwork Mon Mar 10 12:03:16 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Ryabinin <arbn@yandex-team.com>
X-Patchwork-Id: 14009708
From: Andrey Ryabinin <arbn@yandex-team.com>
To: linux-kernel@vger.kernel.org
Cc: Alexander Graf, James Gowans, Mike Rapoport, Andrew Morton,
 linux-mm@kvack.org, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Eric Biederman,
 kexec@lists.infradead.org, Pratyush Yadav, Jason Gunthorpe,
 Pasha Tatashin, David Rientjes, Andrey Ryabinin
Subject: [PATCH v2 5/7] x86, kstate: Add the ability to preserve memory pages across kexec.
Date: Mon, 10 Mar 2025 13:03:16 +0100
Message-ID: <20250310120318.2124-6-arbn@yandex-team.com>
X-Mailer: git-send-email 2.45.3
In-Reply-To: <20250310120318.2124-1-arbn@yandex-team.com>
References: <20250310120318.2124-1-arbn@yandex-team.com>

This adds the ability to mark pages of memory that kstate needs to
preserve across kexec. kstate_register_page() stores the struct page in a
dedicated list of 'struct kpage_state' entries. At the kexec reboot stage
this list is iterated and the physical address and order of each
registered page are saved into kstate's data stream. After kexec the new
kernel reads them back from the stream and marks that memory as reserved
(via memblock) so it stays intact.
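(Editorial illustration, not part of this patch: a minimal sketch of how a
caller might use the kstate_register_page() API added here. The function
preserve_my_page() and the 'preserved_page' variable are hypothetical.)

#include <linux/gfp.h>
#include <linux/kstate.h>
#include <linux/mm.h>

/*
 * Hypothetical caller: allocate one page and ask kstate to keep it
 * intact across kexec. Its address is written into the kstate stream
 * at kexec time, and the next kernel reserves it via memblock.
 */
static struct page *preserved_page;

static int preserve_my_page(void)
{
	int err;

	preserved_page = alloc_page(GFP_KERNEL);
	if (!preserved_page)
		return -ENOMEM;

	/* order 0: a single page */
	err = kstate_register_page(preserved_page, 0);
	if (err) {
		__free_page(preserved_page);
		preserved_page = NULL;
	}
	return err;
}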
Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
---
 include/linux/kstate.h |  30 ++++++++++
 kernel/kexec_core.c    |   3 +-
 kernel/kstate.c        | 124 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 156 insertions(+), 1 deletion(-)

diff --git a/include/linux/kstate.h b/include/linux/kstate.h
index ae583d090111..36cfefd87572 100644
--- a/include/linux/kstate.h
+++ b/include/linux/kstate.h
@@ -88,6 +88,8 @@ struct kstate_field {
 };
 
 enum kstate_ids {
+	KSTATE_RSVD_MEM_ID = 1,
+	KSTATE_STRUCT_PAGE_ID,
 	KSTATE_LAST_ID = -1,
 };
 
@@ -124,6 +126,8 @@ static inline unsigned long kstate_get_ulong(struct kstate_stream *stream)
 	return ret;
 }
 
+extern struct kstate_description page_state;
+
 #ifdef CONFIG_KSTATE
 
 void kstate_init(void);
@@ -141,6 +145,12 @@ void restore_kstate(struct kstate_stream *stream, int id,
 		    const struct kstate_description *kstate, void *obj);
 int kstate_load_migrate_buf(struct kimage *image);
 
+int kstate_page_save(struct kstate_stream *stream, void *obj,
+		     const struct kstate_field *field);
+int kstate_register_page(struct page *page, int order);
+
+bool kstate_range_is_preserved(unsigned long start, unsigned long end);
+
 #else
 
 static inline void kstate_init(void) { }
@@ -150,6 +160,11 @@ static inline int kstate_save_state(void) { return 0; }
 static inline void free_kstate_stream(void) { }
 
 static inline int kstate_load_migrate_buf(struct kimage *image) { return 0; }
+
+static inline bool kstate_range_is_preserved(unsigned long start,
+					     unsigned long end)
+{ return 0; }
+
 #endif
 
 
@@ -176,6 +191,21 @@ static inline int kstate_load_migrate_buf(struct kimage *image) { return 0; }
 		.offset = offsetof(_state, _f),		\
 	}
 
+#define KSTATE_PAGE(_f, _state)				\
+	{						\
+		.name = "page",				\
+		.flags = KS_CUSTOM,			\
+		.offset = offsetof(_state, _f),		\
+		.save = kstate_page_save,		\
+	},						\
+	KSTATE_ADDRESS(_f, _state, KS_VMEMMAP_ADDR),	\
+	{						\
+		.name = "struct_page",			\
+		.flags = KS_STRUCT | KS_POINTER,	\
+		.offset = offsetof(_state, _f),		\
+		.ksd = &page_state,			\
+	}
+
 #define KSTATE_END_OF_LIST() {				\
 	.flags = KS_END,\
 }
diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 7c79addeb93b..5d001b7a9e44 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include <linux/kstate.h>
 #include
 #include
 #include
@@ -261,7 +262,7 @@ int kimage_is_destination_range(struct kimage *image,
 			return 1;
 	}
 
-	return 0;
+	return kstate_range_is_preserved(start, end);
 }
 
 int kimage_is_control_page(struct kimage *image,
diff --git a/kernel/kstate.c b/kernel/kstate.c
index d35996287b76..68a1272abceb 100644
--- a/kernel/kstate.c
+++ b/kernel/kstate.c
@@ -309,6 +309,13 @@ int kstate_register(struct kstate_description *state, void *obj)
 	return 0;
 }
 
+int kstate_page_save(struct kstate_stream *stream, void *obj,
+		     const struct kstate_field *field)
+{
+	kstate_register_page(*(struct page **)obj, 0);
+	return 0;
+}
+
 static int __init setup_kstate(char *arg)
 {
 	char *end;
@@ -323,7 +330,124 @@ static int __init setup_kstate(char *arg)
 }
 early_param("kstate_stream", setup_kstate);
 
+/*
+ * TODO: probably should use folio instead/in addition,
+ * also will need to think/decide what fields
+ * to preserve or not
+ */
+struct kstate_description page_state = {
+	.name = "struct_page",
+	.id = KSTATE_STRUCT_PAGE_ID,
+	.state_list = LIST_HEAD_INIT(page_state.state_list),
+	.fields = (const struct kstate_field[]) {
+		KSTATE_BASE_TYPE(_mapcount, struct page, atomic_t),
+		KSTATE_BASE_TYPE(_refcount, struct page, atomic_t),
+		KSTATE_END_OF_LIST()
+	},
+};
+
+struct state_entry preserved_se;
+
+struct preserved_pages {
+	unsigned int nr_pages;
+	struct list_head list;
+};
+
+struct kpage_state {
+	struct list_head list;
+	u8 order;
+	struct page *page;
+};
+
+struct preserved_pages preserved_pages = {
+	.list = LIST_HEAD_INIT(preserved_pages.list)
+};
+
+int kstate_register_page(struct page *page, int order)
+{
+	struct kpage_state *state;
+
+	state = kmalloc(sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return -ENOMEM;
+
+	state->page = page;
+	state->order = order;
+	list_add(&state->list, &preserved_pages.list);
+	preserved_pages.nr_pages++;
+	return 0;
+}
+
+static int kstate_pages_save(struct kstate_stream *stream, void *obj,
+			     const struct kstate_field *field)
+{
+	struct kpage_state *p_state;
+	int ret;
+
+	list_for_each_entry(p_state, &preserved_pages.list, list) {
+		unsigned long paddr = page_to_phys(p_state->page);
+
+		ret = kstate_save_data(stream, &p_state->order,
+				       sizeof(p_state->order));
+		if (ret)
+			return ret;
+		ret = kstate_save_data(stream, &paddr, sizeof(paddr));
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+bool kstate_range_is_preserved(unsigned long start, unsigned long end)
+{
+	struct kpage_state *p_state;
+
+	list_for_each_entry(p_state, &preserved_pages.list, list) {
+		unsigned long pstart, pend;
+		pstart = page_to_boot_pfn(p_state->page);
+		pend = pstart + (p_state->order << PAGE_SHIFT) - 1;
+		if ((end >= pstart) && (start <= pend))
+			return 1;
+	}
+	return 0;
+}
+
+static int __init kstate_pages_restore(struct kstate_stream *stream, void *obj,
+				       const struct kstate_field *field)
+{
+	struct preserved_pages *preserved_pages = obj;
+	int nr_pages, i;
+
+	nr_pages = preserved_pages->nr_pages;
+	for (i = 0; i < nr_pages; i++) {
+		int order = kstate_get_byte(stream);
+		unsigned long phys = kstate_get_ulong(stream);
+
+		memblock_reserve(phys, PAGE_SIZE << order);
+	}
+	return 0;
+}
+
+struct kstate_description kstate_preserved_mem = {
+	.name = "preserved_range",
+	.id = KSTATE_RSVD_MEM_ID,
+	.state_list = LIST_HEAD_INIT(kstate_preserved_mem.state_list),
+	.fields = (const struct kstate_field[]) {
+		KSTATE_BASE_TYPE(nr_pages, struct preserved_pages, unsigned int),
+		{
+			.name = "pages",
+			.flags = KS_CUSTOM,
+			.size = sizeof(struct preserved_pages),
+			.save = kstate_pages_save,
+			.restore = kstate_pages_restore,
+		},
+
+		KSTATE_END_OF_LIST()
+	},
+};
+
 void __init kstate_init(void)
 {
 	memblock_reserve(kstate_stream_addr, kstate_size);
+	__kstate_register(&kstate_preserved_mem, &preserved_pages,
+			  &preserved_se);
 }
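(Editorial illustration, not part of this patch: a rough sketch of how the
new KSTATE_PAGE() macro could be used when describing a state structure
that owns a page. 'struct my_state', 'my_state_desc' and KSTATE_MY_STATE_ID
are hypothetical names; KSTATE_ADDRESS()/KS_VMEMMAP_ADDR, which
KSTATE_PAGE() expands to, are assumed to come from earlier patches in this
series.)

/* Hypothetical state object holding one page to keep across kexec. */
struct my_state {
	unsigned long flags;
	struct page *buf_page;
};

struct kstate_description my_state_desc = {
	.name = "my_state",
	.id = KSTATE_MY_STATE_ID,	/* hypothetical enum kstate_ids entry */
	.state_list = LIST_HEAD_INIT(my_state_desc.state_list),
	.fields = (const struct kstate_field[]) {
		KSTATE_BASE_TYPE(flags, struct my_state, unsigned long),
		/* expands to the page/address/struct_page field triple */
		KSTATE_PAGE(buf_page, struct my_state),
		KSTATE_END_OF_LIST()
	},
};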