From patchwork Wed Oct 2 16:07:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820014 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A68ECF6D3B for ; Wed, 2 Oct 2024 16:09:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DEB5F4401BD; Wed, 2 Oct 2024 12:08:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D9AD64401B5; Wed, 2 Oct 2024 12:08:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C14044401BD; Wed, 2 Oct 2024 12:08:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 9D2104401B5 for ; Wed, 2 Oct 2024 12:08:58 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 453A2160AAA for ; Wed, 2 Oct 2024 16:08:58 +0000 (UTC) X-FDA: 82629145956.18.FEF2A4D Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf12.hostedemail.com (Postfix) with ESMTP id A9A0040018 for ; Wed, 2 Oct 2024 16:08:54 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b="Wl/qGGKH"; dmarc=pass (policy=none) header.from=yandex-team.com; spf=pass (imf12.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885315; a=rsa-sha256; cv=none; b=SepwgdvStAH85sPH4j3vi2mk1/tOksagGeRggan3cibExjy7cRYpDPtc2CpjaNMEZIsyjR 
Eoodrp1D1x6NzjTyZQioq26KRGPQoiIbSWJFdmjEUA4T1s0htgVs0v0j14yNxwj8Wf3QFh Jj+tcwlrJ7QXvmb9PHdtBjqDKh8iDSA= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b="Wl/qGGKH"; dmarc=pass (policy=none) header.from=yandex-team.com; spf=pass (imf12.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885315; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=4BDwgTNznoqh19FbW1xbxFVkO1uBqVi3RNsbBRD27Tk=; b=kYchHIe+XTCZ0XRZ2ehka0HRzAAGVBaIsUN0O54HVIooH798wfrMdRpm2wEgGom6jP84rP CzJjejGYwjnRpamxD4kEdLPhItVes9jma1b5WQb/g0fgvGH6TIVlFOuTVslk2faOGWQaMD UuC/1SbSy2pW3bz+TtS+0NNifJeurdo= Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) with ESMTPS id 2065060A59; Wed, 2 Oct 2024 19:08:53 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-Eli2J3P3; Wed, 02 Oct 2024 19:08:51 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885332; bh=4BDwgTNznoqh19FbW1xbxFVkO1uBqVi3RNsbBRD27Tk=; h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=Wl/qGGKHTSFA9wWN1bbDi0YbwEitbGlvPpRRrKsCp6ccao0ouB11BBtvPJ8m+RQU9 S1A+woHBqqO/jmqdf0EgywyOtxMlp2iXTZwnlRl9UdmwlHpLYKiFEQdpmQ5pmgN62h H2VfuCOascBrttGcumZxN4A5bl1+FwHOgUl+Ts/8= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander Graf , James Gowans , Mike Rapoport , Andrew Morton 
, linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 1/7] kstate: Add kstate - a mechanism to migrate some kernel state across kexec Date: Wed, 2 Oct 2024 18:07:16 +0200 Message-ID: <20241002160722.20025-2-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspam-User: X-Stat-Signature: 7ffai3rpnaoszc875n6fnxgemthjt7a1 X-Rspamd-Queue-Id: A9A0040018 X-Rspamd-Server: rspam02 X-HE-Tag: 1727885334-680396 X-HE-Meta: U2FsdGVkX1+5KsTBC7uVTcXiwK8588gBG8ay0URmx6eErqpQ5btNd12N91GoW9QVVkNDLHzDSubVIITjnKL+hUZWkaPI/WVfHG+Mt5c/IxX5QiU517ehlnPSMGZob6GugK9AXJRnvXn043wyQLjudV/pFBVVKrqbImMP9zuqUqx1ag9bLtbfEP4qzbfijgatYBwHfPXLtZLTbGLz2SlI1QTh82In+d01ayd1lL4N5bHK8drwyvdS3vYQKvsAbdEZWtPjX3B/MZiSAQUfwAkSmC1QmfqdJRB1H0awdhl0kB2N7Mo//cTjHC31dHvW68xKundlV8TgVE30uJIp7dO8osyNRzCyXHOsK64XypCIhKXIw2UDa/5NYmjLkvnVEzYifpLCHBY2t+KgY1Fv1HzcqN/d49TrA0OFDOtDZ7C+2P9nSHrcXXUJUWpQjPh8ku1QgMckXWdF1c9dHGGnjU6IWx1Yo3Bb8Y/2qWYVYG1Mybc/rxhXpJi85f6DtlwuejwEVHLnjsAsJ2QQ4ZaxLdwjfzSMBfgKqNbf5COABDj+nXF5FAvhZeFOjBo1CfY9hSJcKou3gxo9A2xN6Eeg8L5F2+O/UZXvYIcAt50MAYjPt3AHq9itE3mqHc6pDtsVph5InU0ovRmyUaA4vuUODyyBbXFtRO5W0E3FbGRuZhzIrM/Lhw8eyuTxbmG5Y6EeEiXKLp2718DDstSkQMp9NWYDF3G+p4LVizyHL1Zah3YfqSrnzSmYXy2/3a+3KpyaYErj9s9U/tYazvgPscJT/mqqXxyFWcdUC7I5B2LRNcTGjdccqGW85mda3gjXMr2JpDPu3jQTAGL4/fbpwHmdr1tA5u85Gi9Q0cRGPSpXQlwf4/10wN17ijL5m2GE4L5T/MyjXpyj08X0Bu/Oe2n9as+AlASwaDnmqpAKoTnnCvWflB2CTPYLxkyStEWdScpjbOFfAkB1Buj7Gvzqx4ZGIJz yxLWZLzu 
pLCF1FQn+Fvfq6vs6Na0TeuieIqI+ck7d6AB/BNJQUgBI6yaj9vo+2rf2V336V+Mxip02Oq7fYDozZjHMK1PMsp6McNd4ZWhnucbFmNt726kWoX5iCSlYxGSSl9GgWTfp+UmjLKITIbpdyLn93p4fY6PMAfDXWMR5bZdkMvnsWHxvDurwW14FSfxu+h+EhQzW2pyxiv4uCfW8IOVxrsrpW2vSFY4lKVC0a79PSJRDxvt+LnzYlja1fOp9OXD8BUB8TlpIKjiULzc5QCyNVUmjfB8jh1R7aqgi7HM3PI/XqnVZzmPnEykjaicsZlYpN4SQkk+lW6f77xOGO8puhWoABIzOSe/cWsQHaQw/qgiU7v7/UII= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

kstate (kernel state) is a mechanism to describe (part of) the internal
kernel state, save it into memory, and restore it in the new kernel
after kexec.

The end goal and the main use case is to be able to update the host
kernel under VMs with VFIO pass-through devices running on that host.
We are still pretty far from that end goal. This and the following
patches only try to establish some basic infrastructure to describe and
migrate complex in-kernel states. As a demonstration, the state of the
trace buffer is migrated across kexec to the new kernel (in the
follow-up patches).

States (usually some struct) are described by a
'struct kstate_description' containing an array of individual field
descriptions - 'struct kstate_field'. Fields have different types like:
  KS_SIMPLE - trivial type that is just copied by value
  KS_POINTER - field contains a pointer; it will be dereferenced to
               copy the value during the save/restore phases.
  KS_STRUCT - contains another struct; field->ksd must point to
              another 'struct kstate_description'
  KS_CUSTOM - something that does not fit the trivial types above;
              for these fields the field->save()/->restore()
              callbacks must do all the work
  KS_ARRAY_OF_POINTER - array of pointers; the size of the array is
                        determined by the field->count() callback
  KS_END - special flag indicating the end of migration stream data.
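
To make the field descriptions above a bit more concrete, here is a
minimal userspace sketch of the idea, not the code from this patch.
It only covers the "copy by value" and "dereference, then copy" cases,
and every name in it (demo_field, demo_obj, demo_save(),
demo_restore(), the DEMO_* flags) is made up for illustration. The
table is walked until the END flag, the same way the loops in
kernel/kstate.c walk the fields until KS_END.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-ins for KS_SIMPLE / KS_POINTER / KS_END. */
#define DEMO_SIMPLE  (1u << 0)   /* copy the field by value */
#define DEMO_POINTER (1u << 1)   /* dereference the field, copy the pointee */
#define DEMO_END     (1u << 31)  /* terminates the field table */

struct demo_field {
	size_t offset;       /* offset of the field inside the object */
	size_t size;         /* bytes to copy (pointee size for DEMO_POINTER) */
	unsigned int flags;
};

/* Hypothetical object whose state we want to carry over. */
struct demo_obj {
	int counter;
	long *value;         /* points at storage owned elsewhere */
};

static const struct demo_field demo_fields[] = {
	{ offsetof(struct demo_obj, counter), sizeof(int),  DEMO_SIMPLE  },
	{ offsetof(struct demo_obj, value),   sizeof(long), DEMO_POINTER },
	{ 0, 0, DEMO_END },
};

/* Find where the data of one field actually lives. */
static void *demo_field_addr(void *obj, const struct demo_field *f)
{
	unsigned char *p = (unsigned char *)obj + f->offset;

	if (f->flags & DEMO_POINTER) {
		void *target;

		/* The field holds a pointer: follow it. */
		memcpy(&target, p, sizeof(target));
		return target;
	}
	return p;
}

/* Append every described field to the stream, return the new stream end. */
static unsigned char *demo_save(unsigned char *stream, void *obj,
				const struct demo_field *f)
{
	for (; f->flags != DEMO_END; f++) {
		memcpy(stream, demo_field_addr(obj, f), f->size);
		stream += f->size;
	}
	return stream;
}

/* Read the fields back in the same order they were saved. */
static const unsigned char *demo_restore(const unsigned char *stream,
					 void *obj,
					 const struct demo_field *f)
{
	for (; f->flags != DEMO_END; f++) {
		memcpy(demo_field_addr(obj, f), stream, f->size);
		stream += f->size;
	}
	return stream;
}
```

Note that in this sketch the pointer itself is never written to the
stream, only what it points to, so the restoring side has to provide
its own storage before calling demo_restore().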

kstate_register() accepts a kstate_description along with an instance
of an object and registers it in the global 'states' list. During the
kexec reboot phase this list is iterated, and for each instance in the
list a 'struct kstate_entry' is formed and saved in the migration
stream. A 'kstate_entry' contains information such as the ID of the
kstate_description, its version, the size of the migration data and
the data itself.

After the reboot, when kstate_register() is called, it parses the
migration stream, finds the appropriate 'kstate_entry' and restores the
contents of the object.

This is an early RFC, so the code is somewhat hacky and some parts of
this feature aren't well thought through yet (like dealing with struct
changes between the old and new kernel, the fixed size of the migration
stream memory, and many more).

Signed-off-by: Andrey Ryabinin
--- include/linux/kstate.h | 118 ++++++++++++++++++++++++ kernel/Kconfig.kexec | 12 +++ kernel/Makefile | 1 + kernel/kstate.c | 198 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 329 insertions(+) create mode 100644 include/linux/kstate.h create mode 100644 kernel/kstate.c diff --git a/include/linux/kstate.h b/include/linux/kstate.h new file mode 100644 index 0000000000000..c97804d0243ea --- /dev/null +++ b/include/linux/kstate.h @@ -0,0 +1,118 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _KSTATE_H +#define _KSTATE_H + +#include +#include +#include + +struct kstate_description; +enum kstate_flags { + KS_SIMPLE = (1 << 0), + KS_POINTER = (1 << 1), + KS_STRUCT = (1 << 2), + KS_CUSTOM = (1 << 3), + KS_ARRAY_OF_POINTER = (1 << 4), + KS_END = (1UL << 31), +}; + +struct kstate_field { + const char *name; + size_t offset; + size_t size; + enum kstate_flags flags; + const struct kstate_description *ksd; + int version_id; + int (*restore)(void *mig_stream, void *obj, const struct kstate_field *field); + int (*save)(void *mig_stream, void *obj, const struct kstate_field *field); + int (*count)(void); +}; + +enum kstate_ids { + KSTATE_LAST_ID = -1, +}; + 
+struct kstate_description { + const char *name; + enum kstate_ids id; + atomic_t instance_id; + int version_id; + struct list_head state_list; + + const struct kstate_field *fields; +}; + +struct state_entry { + u64 id; + struct list_head list; + struct kstate_description *kstd; + void *obj; +}; + +static inline bool kstate_get_byte(void **mig_stream) +{ + bool ret = **(u8 **)mig_stream; + (*mig_stream)++; + return ret; +} +static inline void *kstate_save_byte(void *mig_stream, u8 val) +{ + *(u8 *)mig_stream = val; + return mig_stream + sizeof(val); +} + +static inline void *kstate_save_ulong(void *mig_stream, unsigned long val) +{ + *(unsigned long *)mig_stream = val; + return mig_stream + sizeof(val); +} +static inline unsigned long kstate_get_ulong(void **mig_stream) +{ + unsigned long ret = **(unsigned long **)mig_stream; + (*mig_stream) += sizeof(unsigned long); + return ret; +} + +#ifdef CONFIG_KSTATE +bool is_migrate_kernel(void); + +void save_migrate_state(unsigned long mig_stream); + +void __kstate_register(struct kstate_description *state, + void *obj, struct state_entry *se); +int kstate_register(struct kstate_description *state, void *obj); + +struct kstate_entry; +void *save_kstate(void *stream, int id, const struct kstate_description *kstate, + void *obj); +void *restore_kstate(struct kstate_entry *ke, int id, + const struct kstate_description *kstate, void *obj); +#else + +#define __kstate_register(state, obj, se) +#define kstate_register(state, obj) + +static inline void save_migrate_state(unsigned long mig_stream) { } + +#endif + + +#define KSTATE_SIMPLE(_f, _state) { \ + .name = (__stringify(_f)), \ + .size = sizeof_field(_state, _f), \ + .flags = KS_SIMPLE, \ + .offset = offsetof(_state, _f), \ + } + +#define KSTATE_POINTER(_f, _state) { \ + .name = (__stringify(_f)), \ + .size = sizeof(*(((_state *)0)->_f)), \ + .flags = KS_POINTER, \ + .offset = offsetof(_state, _f), \ + } + +#define KSTATE_END_OF_LIST() { \ + .flags = KS_END,\ + } + +#endif 
diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 6c34e63c88ff4..d8fecf29e384a 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -151,4 +151,16 @@ config CRASH_MAX_MEMORY_RANGES the computation behind the value provided through the /sys/kernel/crash_elfcorehdr_size attribute. +config KSTATE + bool "Migrate certain internal kernel state across kexec" + default n + depends on CRASH_DUMP + help + Enable functionality to migrate some internal kernel states to new + kernel across kexec. Currently capable only migrating trace buffers + as an example. Can be extended to other states like IOMMU page tables, + VFIO state of the device... + Description of the trace buffer saved into memory preserved across kexec. + The new kernel reads description to restore the state of trace buffers. + endmenu diff --git a/kernel/Makefile b/kernel/Makefile index 87866b037fbed..6bdf947fc84f5 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -75,6 +75,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_core.o obj-$(CONFIG_KEXEC) += kexec.o obj-$(CONFIG_KEXEC_FILE) += kexec_file.o obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o +obj-$(CONFIG_KSTATE) += kstate.o obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o obj-$(CONFIG_COMPAT) += compat.o obj-$(CONFIG_CGROUPS) += cgroup/ diff --git a/kernel/kstate.c b/kernel/kstate.c new file mode 100644 index 0000000000000..0ef228baef94e --- /dev/null +++ b/kernel/kstate.c @@ -0,0 +1,198 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include + +static LIST_HEAD(states); + +struct kstate_entry { + int state_id; + int version_id; + int instance_id; + int size; + DECLARE_FLEX_ARRAY(u8, data); +}; + +void *save_kstate(void *stream, int id, const struct kstate_description *kstate, + void *obj) +{ + const struct kstate_field *field = kstate->fields; + struct kstate_entry *ke = stream; + + stream = ke->data; + + ke->state_id = kstate->id; + ke->version_id = kstate->version_id; + ke->instance_id = id; + + while 
(field->flags != KS_END) { + void *first, *cur; + int n_elems = 1; + int size, i; + + first = obj + field->offset; + + if (field->flags & KS_POINTER) + first = *(void **)(obj + field->offset); + if (field->count) + n_elems = field->count(); + size = field->size; + for (i = 0; i < n_elems; i++) { + cur = first + i * size; + + if (field->flags & KS_ARRAY_OF_POINTER) + cur = *(void **)cur; + + if (field->flags & KS_STRUCT) + stream = save_kstate(stream, 0, field->ksd, cur); + else if (field->flags & KS_CUSTOM) { + if (field->save) + stream += field->save(stream, cur, field); + } else if (field->flags & (KS_SIMPLE|KS_POINTER)) { + memcpy(stream, cur, size); + stream += size; + } else + WARN_ON_ONCE(1); + + } + field++; + + } + + ke->size = (u8 *)stream - ke->data; + return stream; +} + +void save_migrate_state(unsigned long mig_stream) +{ + struct state_entry *se; + struct kstate_entry *ke; + void *dest; + struct page *page; + + page = boot_pfn_to_page(mig_stream >> PAGE_SHIFT); + arch_kexec_post_alloc_pages(page_address(page), 512, 0); + dest = page_address(page); + list_for_each_entry(se, &states, list) + dest = save_kstate(dest, se->id, se->kstd, se->obj); + ke = dest; + ke->state_id = KSTATE_LAST_ID; +} + +void *restore_kstate(struct kstate_entry *ke, int id, + const struct kstate_description *kstate, void *obj) +{ + const struct kstate_field *field = kstate->fields; + u8 *stream = ke->data; + + WARN_ONCE(ke->version_id != kstate->version_id, "version mismatch %d %d\n", + ke->version_id, kstate->version_id); + + WARN_ONCE(ke->instance_id != id, "instance id mismatch %d %d\n", + ke->instance_id, id); + + while (field->flags != KS_END) { + void *first, *cur; + int n_elems = 1; + int size, i; + + first = obj + field->offset; + if (field->flags & KS_POINTER) + first = *(void **)(obj + field->offset); + if (field->count) + n_elems = field->count(); + size = field->size; + for (i = 0; i < n_elems; i++) { + cur = first + i * size; + + if (field->flags & 
KS_ARRAY_OF_POINTER) + cur = *(void **)cur; + + if (field->flags & KS_STRUCT) + stream = restore_kstate((struct kstate_entry *)stream, + 0, field->ksd, cur); + else if (field->flags & KS_CUSTOM) { + if (field->restore) + stream += field->restore(stream, cur, field); + } else if (field->flags & (KS_SIMPLE|KS_POINTER)) { + memcpy(cur, stream, size); + stream += size; + } else + WARN_ON_ONCE(1); + + } + field++; + } + + return stream; +} + +static void restore_migrate_state(unsigned long mig_stream, + struct state_entry *se) +{ + char *dest; + struct kstate_entry *ke; + + if (mig_stream == -1) + return; + + dest = phys_to_virt(mig_stream); + ke = (struct kstate_entry *)dest; + while (ke->state_id != KSTATE_LAST_ID) { + if (ke->state_id != se->kstd->id || + ke->instance_id != se->id) { + ke = (struct kstate_entry *)(ke->data + ke->size); + continue; + } + + restore_kstate(ke, se->id, se->kstd, se->obj); + ke = (struct kstate_entry *)(ke->data + ke->size); + } +} + +unsigned long long migrate_stream_addr = -1; +EXPORT_SYMBOL_GPL(migrate_stream_addr); +unsigned long long migrate_stream_size; + +bool is_migrate_kernel(void) +{ + return migrate_stream_addr != -1; +} + +void __kstate_register(struct kstate_description *state, void *obj, struct state_entry *se) +{ + se->kstd = state; + se->id = atomic_inc_return(&state->instance_id); + se->obj = obj; + list_add(&se->list, &states); + restore_migrate_state(migrate_stream_addr, se); +} + +int kstate_register(struct kstate_description *state, void *obj) +{ + struct state_entry *se; + + se = kmalloc(sizeof(*se), GFP_KERNEL); + if (!se) + return -ENOMEM; + + __kstate_register(state, obj, se); + return 0; +} + +static int __init setup_migrate(char *arg) +{ + char *end; + + if (!arg) + return -EINVAL; + migrate_stream_addr = memparse(arg, &end); + if (*end == '@') { + migrate_stream_size = migrate_stream_addr; + migrate_stream_addr = memparse(end + 1, &end); + } + return end > arg ? 
0 : -EINVAL; +} +early_param("migrate_stream", setup_migrate); From patchwork Wed Oct 2 16:07:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820015 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 311C4CF6D3C for ; Wed, 2 Oct 2024 16:09:03 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2C7A84401BE; Wed, 2 Oct 2024 12:09:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 27A2E4401B5; Wed, 2 Oct 2024 12:09:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 11C274401BE; Wed, 2 Oct 2024 12:09:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id D62F54401B5 for ; Wed, 2 Oct 2024 12:08:59 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 6550416098A for ; Wed, 2 Oct 2024 16:08:59 +0000 (UTC) X-FDA: 82629145998.29.A49F1D1 Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf09.hostedemail.com (Postfix) with ESMTP id 54530140004 for ; Wed, 2 Oct 2024 16:08:57 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=rISzUCre; spf=pass (imf09.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885297; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=FEwZkDU2VcUSP99EqMYuNuirJIt2aNspqQO+oi3JBZ0=; b=AYxa5NSbOmZrfM6dwNIMdxA5Vyy5twCw2TFq8pauL6J1pd5I+1ecMqMArrya9tVCkBLpo+ 0ZAXylI3n+QzocOiE2qScWahFMwmh2Bzzw0G7dezPIjM2bX54znkoy1NB3+GX3XMT86RzX vAHGNamzbmYLkcgq/2UKcBPWfIl+Kx8= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=rISzUCre; spf=pass (imf09.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885297; a=rsa-sha256; cv=none; b=8Gs1/DykfWGLg3+CQfyb+aWNb9cnOzSnFWYutwA9ibYuMy3Mkx12pAvUrEsZvnUUaEkjtl 4eqRb4Qy9A6DRNMNlz6a6N0mJP35hcF3bbwfPUyjNZ1iRho5202vC+3AbMnOiq2H+SLNNu 0vrLz6caLG1EtZPXACLF76cppYsCQI8= Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) with ESMTPS id C8E5760A6D; Wed, 2 Oct 2024 19:08:55 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-NNW0FOgv; Wed, 02 Oct 2024 19:08:54 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885334; bh=FEwZkDU2VcUSP99EqMYuNuirJIt2aNspqQO+oi3JBZ0=; h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=rISzUCreiYPr5KO7sSHN36OWn/OE/1SW6J4qWdFjSKXFl/OmP3/AsvrO93oc/rfO3 E5TEW/kK1jsj+bzA3aigLi1L2TRjEmagB2KlDbxL85Ej/R71PObcT7q7wfbME/qBX8 269sah/PVLakog1l83ZOSHYhyBL6jgF+dNfxMlRE= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander 
Graf , James Gowans , Mike Rapoport , Andrew Morton , linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 2/7] kexec: Hack and abuse crashkernel for the kstate's migration stream Date: Wed, 2 Oct 2024 18:07:17 +0200 Message-ID: <20241002160722.20025-3-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspam-User: X-Stat-Signature: s59eeitpqus6m5unwqrsdetcmr8rz6tp X-Rspamd-Queue-Id: 54530140004 X-Rspamd-Server: rspam11 X-HE-Tag: 1727885337-804988 X-HE-Meta: U2FsdGVkX183mg0/RK20w3PR1MUGLY8EKCyoo4ytd1oSdJ3ZoSn5beyXaKOaCAdoAkBGLmxfhtoyNMgjf2ul3q8ElgA+jf5y7YCQ6kljzTiJwHAS6E/1IpiOrWX7OaFkC8zj9DE9e0Xtlc+TqEc2LBo8cs5tFTyGDu4JHMQIsiWYgnapDy4OHNPp0fQMXhXT1FHWPwi7zCS/HoSmIH7WjUyCfBrmJ5H646A3X8wHxAI2F5LhTl/bHVlXEPx48vFMLDs0oIbROqKi8EvzaE5EIjqlvUUHNfKOSC3F73OmIeDIgMDiq1Z84iqDdD2/90IuH+BVnDKugZHfx7Ap4PHE3M5xRD5HRoYB3QUB+av248SsRjrRhflEKwcHky72lHEQuwV2bbzVvYWxsiAxrD7IcNQDscQwGnbbr6XOcCwDyne6+fzEGY3mCJvLDAYM84AxBpphkhME1KR9NzCQAwg/mDtEmKtzv7s2ehg7BDF0TdEEvhB8LwsBaWUOJ9Pqk2D901Mal93Y9zyewDuXbmGLphrc5GB5CQWrLSqpS1XZ+/t6yGU9jPGhHVRgybaeKEGlOPn8U9Fq4t/BSOSgBokNrAyde6Mzz3pw32MsbPjOvaT9molgDnr/yLi6lytPgbv8ux+rpY75OO+DKKhJlYnTVK3monv6awJaDbw9X400TUNCCbBunx7SmrYXz1/ScNxrsvdY1CYqNAqlT5p67To07gf3o77/xEp8oLBEjt7MNdNu+s9Xq/ykXMVEPvRDnXnj/wm1DMkcKWrspVQ5oAsSWeeApKo48mMEAbuo222ohwp5rfCaMs9uQuPk3c3Gv+s+9rkfhpfebgn5ywabSci7B9jxHvFFZQ0+/QHC2h9EwXEcRI9Jt1xvBP/1Vp2Iw6DaHM+aFsyuHxPrZZoBVLgDWDHbK6r6lUQbpcYfvpDutqJ134bCp09VWrIaWbvx5LE/N1+sTniwAhkhnuH9b1S dYi3AP32 
RrtrQuvPyJfYDXo2A9X7UcMkHVk1Y5j4je7aKPFZhDlpxoirVXFKOfAjkLSiVPxIOIH/IxJMEzs2xkFSw6uuppuNdc4YoaoNhgtRYSp7bz/SOuzLtTllvs0oYnR8T88iSFcYHveGkNLtn5va5NkWCEdKLErUO1djCkhSjZElAd0QUuK298Sz2DdelLCt7tm/M+5cbcud2E+QoKYVCJ/JnWDX6fH/XFkR8etUdetMD9yeUI4hiHZHy3xDlIe4dfydQxyoQACMBm0QIeoEvChnUHtT0Idy5MLdvMVIhs8KqxH8HnilZnzsBkgeEgKpW/qMH38LM6RDeqyhFU3SthRUnQssADNj9m0h8R5JG8jNT77UcP9Y= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe:

This is an early, ugly hack just for now. It will be completely redone
later.

This abuses the crashkernel memory segment for kstate purposes, to save
and restore object descriptions. The proper solution would probably be
to use segments in the ordinary kexec mechanism. However, since kstate
requires such segments very late (at the reboot stage, not the load
stage), some thought and work will be required to make that happen.

The KEXEC_FILE_MIGRATE/KEXEC_TYPE_MIGRATE flags likely won't be
required either.

Signed-off-by: Andrey Ryabinin --- arch/x86/kernel/kexec-bzimage64.c | 36 ++++++++++++++++++++++++++++++ arch/x86/kernel/machine_kexec_64.c | 5 ++++- include/linux/kexec.h | 6 +++-- include/uapi/linux/kexec.h | 2 ++ kernel/crash_core.c | 3 ++- kernel/kexec_core.c | 10 ++++++++- kernel/kexec_file.c | 15 +++++++++++-- 7 files changed, 70 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 68530fad05f74..71c82841e6b12 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -77,6 +78,11 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params, len = sprintf(cmdline_ptr, "elfcorehdr=0x%lx ", image->elf_load_addr); } + if (image->type == KEXEC_TYPE_MIGRATE) { + len = sprintf(cmdline_ptr, + "migrate_stream=0x0%llx ", crashk_res.start); + } + memcpy(cmdline_ptr + len, cmdline, cmdline_len); cmdline_len += len; @@ -389,6 +395,29 @@ static int bzImage64_probe(const char *buf, unsigned long len) return ret; } +static int load_migrate_segments(struct kimage *image) +{ + int ret; + struct kexec_buf kbuf = { .image = image, .buf_min = 0, + .buf_max = ULONG_MAX, .top_down = false }; + + kbuf.bufsz = 4096; + kbuf.buffer = vzalloc(kbuf.bufsz); + + kbuf.memsz = 8*1024*1024; + + kbuf.buf_align = ELF_CORE_HEADER_ALIGN; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; + ret = kexec_add_buffer(&kbuf); + if (ret) + return ret; + image->mig_stream = kbuf.mem; + kexec_dprintk("kstate: Loaded mig_stream at 0x%lx bufsz=0x%lx memsz=0x%lx\n", + image->mig_stream, kbuf.bufsz, kbuf.memsz); + + return ret; +} + static void *bzImage64_load(struct kimage *image, char *kernel, unsigned long kernel_len, char *initrd, unsigned long initrd_len, char *cmdline, @@ -444,6 +473,13 @@ static void *bzImage64_load(struct kimage *image, char *kernel, } #endif + if (image->type == KEXEC_TYPE_MIGRATE) { + ret = 
load_migrate_segments(image); + if (ret) + return ERR_PTR(ret); + + } + /* * Load purgatory. For 64bit entry point, purgatory code can be * anywhere. diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 9c9ac606893e9..edf6234b75baf 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -572,7 +572,10 @@ static void kexec_mark_crashkres(bool protect) kexec_mark_range(crashk_low_res.start, crashk_low_res.end, protect); /* Don't touch the control code page used in crash_kexec().*/ - control = PFN_PHYS(page_to_pfn(kexec_crash_image->control_code_page)); + if (kexec_image && kexec_image->type & KEXEC_TYPE_MIGRATE) + control = PFN_PHYS(page_to_pfn(kexec_image->control_code_page)); + else if (kexec_crash_image) + control = PFN_PHYS(page_to_pfn(kexec_crash_image->control_code_page)); /* Control code page is located in the 2nd page. */ kexec_mark_range(crashk_res.start, control + PAGE_SIZE - 1, protect); control += KEXEC_CONTROL_PAGE_SIZE; diff --git a/include/linux/kexec.h b/include/linux/kexec.h index f0e9f8eda7a3c..182ef76f21860 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -299,6 +299,7 @@ struct kimage { unsigned long start; struct page *control_code_page; struct page *swap_page; + unsigned long mig_stream; void *vmcoreinfo_data_copy; /* locates in the crash memory */ unsigned long nr_segments; @@ -312,9 +313,10 @@ struct kimage { unsigned long control_page; /* Flags to indicate special processing */ - unsigned int type : 1; + unsigned int type : 2; #define KEXEC_TYPE_DEFAULT 0 #define KEXEC_TYPE_CRASH 1 +#define KEXEC_TYPE_MIGRATE 2 unsigned int preserve_context : 1; /* If set, we are using file mode kexec syscall */ unsigned int file_mode:1; @@ -401,7 +403,7 @@ bool kexec_load_permitted(int kexec_image_type); /* List of defined/legal kexec file flags */ #define KEXEC_FILE_FLAGS (KEXEC_FILE_UNLOAD | KEXEC_FILE_ON_CRASH | \ - KEXEC_FILE_NO_INITRAMFS | KEXEC_FILE_DEBUG) + 
KEXEC_FILE_NO_INITRAMFS | KEXEC_FILE_DEBUG | KEXEC_FILE_MIGRATE) /* flag to track if kexec reboot is in progress */ extern bool kexec_in_progress; diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h index 5ae1741ea8ea0..454dc7c8a7d86 100644 --- a/include/uapi/linux/kexec.h +++ b/include/uapi/linux/kexec.h @@ -27,6 +27,8 @@ #define KEXEC_FILE_ON_CRASH 0x00000002 #define KEXEC_FILE_NO_INITRAMFS 0x00000004 #define KEXEC_FILE_DEBUG 0x00000008 +#define KEXEC_FILE_MIGRATE 0X00000010 + /* These values match the ELF architecture values. * Unless there is a good reason that should continue to be the case. diff --git a/kernel/crash_core.c b/kernel/crash_core.c index c1048893f4b68..87b9a52d60352 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -42,7 +42,8 @@ int kimage_crash_copy_vmcoreinfo(struct kimage *image) if (!IS_ENABLED(CONFIG_CRASH_DUMP)) return 0; - if (image->type != KEXEC_TYPE_CRASH) + if (image->type != KEXEC_TYPE_CRASH && + image->type != KEXEC_TYPE_MIGRATE) return 0; /* diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index c0caa14880c3b..ca6283d21235e 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -196,7 +197,8 @@ int sanity_check_segment_list(struct kimage *image) * kernel could corrupt things. 
*/ - if (image->type == KEXEC_TYPE_CRASH) { + if (image->type == KEXEC_TYPE_CRASH || + image->type == KEXEC_TYPE_MIGRATE) { for (i = 0; i < nr_segments; i++) { unsigned long mstart, mend; @@ -461,6 +463,7 @@ struct page *kimage_alloc_control_pages(struct kimage *image, break; #ifdef CONFIG_CRASH_DUMP case KEXEC_TYPE_CRASH: + case KEXEC_TYPE_MIGRATE: pages = kimage_alloc_crash_control_pages(image, order); break; #endif @@ -859,6 +862,7 @@ int kimage_load_segment(struct kimage *image, break; #ifdef CONFIG_CRASH_DUMP case KEXEC_TYPE_CRASH: + case KEXEC_TYPE_MIGRATE: result = kimage_load_crash_segment(image, segment); break; #endif @@ -1044,9 +1048,13 @@ int kernel_kexec(void) */ cpu_hotplug_enable(); pr_notice("Starting new kernel\n"); + arch_kexec_unprotect_crashkres(); machine_shutdown(); } + if (kexec_image->type & KEXEC_TYPE_MIGRATE) + save_migrate_state(kexec_image->mig_stream); + kmsg_dump(KMSG_DUMP_SHUTDOWN); machine_kexec(kexec_image); diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index 3eedb8c226ad8..4a576db4141cd 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -293,6 +293,11 @@ kimage_file_alloc_init(struct kimage **rimage, int kernel_fd, } #endif + if (flags & KEXEC_FILE_MIGRATE) { + image->control_page = crashk_res.start; + image->type = KEXEC_TYPE_MIGRATE; + } + ret = kimage_file_prepare_segments(image, kernel_fd, initrd_fd, cmdline_ptr, cmdline_len, flags); if (ret) @@ -360,6 +365,10 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd, #endif dest_image = &kexec_image; + if (image_type == KEXEC_TYPE_MIGRATE) + if (*dest_image) + arch_kexec_unprotect_crashkres(); + if (flags & KEXEC_FILE_UNLOAD) goto exchange; @@ -428,7 +437,8 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd, image = xchg(dest_image, image); out: #ifdef CONFIG_CRASH_DUMP - if ((flags & KEXEC_FILE_ON_CRASH) && kexec_crash_image) + if (((flags & KEXEC_FILE_ON_CRASH) && kexec_crash_image) || + ((flags & KEXEC_FILE_MIGRATE) && 
kexec_image)) arch_kexec_protect_crashkres(); #endif @@ -608,7 +618,8 @@ static int kexec_walk_resources(struct kexec_buf *kbuf, int (*func)(struct resource *, void *)) { #ifdef CONFIG_CRASH_DUMP - if (kbuf->image->type == KEXEC_TYPE_CRASH) + if (kbuf->image->type == KEXEC_TYPE_CRASH || + kbuf->image->type == KEXEC_TYPE_MIGRATE) return walk_iomem_res_desc(crashk_res.desc, IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY, crashk_res.start, crashk_res.end, From patchwork Wed Oct 2 16:07:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820016 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D659CF6D3E for ; Wed, 2 Oct 2024 16:09:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9F3AC6B0131; Wed, 2 Oct 2024 12:09:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9CD8A6B013E; Wed, 2 Oct 2024 12:09:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 733AB6B0131; Wed, 2 Oct 2024 12:09:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 4A4D56B00E9 for ; Wed, 2 Oct 2024 12:09:02 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 9DEC8A0B08 for ; Wed, 2 Oct 2024 16:09:01 +0000 (UTC) X-FDA: 82629146082.02.C8F7103 Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf11.hostedemail.com (Postfix) with ESMTP id BAA9040021 for ; Wed, 2 Oct 2024 16:08:59 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=yandex-team.com 
header.s=default header.b=qPH+N7ke; spf=pass (imf11.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885169; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jFKAUNHTYomsdDtZ7yDs1yniRUoGKOv+DuJ1bapdrNI=; b=sCc1MDa0BiEu9YLEhh8jk8mkXyaVY8mLRklUweMMC7LgC5y6onCP4MZ+wB7z3kVUolSxlc V6J0rCuBU5ua6MrlCydgVcb7QvDxKgmdTpaUaWUH3OPo1jRNTTpsHdNstNvDaCD4lFFGf7 9PA5rjtRpSnsS6ccnJjOLh9l0GN0AlY= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=qPH+N7ke; spf=pass (imf11.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885169; a=rsa-sha256; cv=none; b=ceKwVzzf/I2iuKGjvSRe+JgiQvv/y++yhKS3g3LmvNaSIr+zIPpuqm4QyXVf3R3BA9UUkn 1IYaI4DeAeEacy1d4zs1YmJgsphztYdw8mozkumaavxZiQJ07ucb3vuG8DjbSWRDJHrdDN 7HX9KySTDxgjDVMk/8r+N7GdEr2ks5o= Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) with ESMTPS id 41C2460A74; Wed, 2 Oct 2024 19:08:58 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-cMZ3bEDP; Wed, 02 Oct 2024 19:08:57 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885337; 
bh=jFKAUNHTYomsdDtZ7yDs1yniRUoGKOv+DuJ1bapdrNI=; h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=qPH+N7keP7fgNy1yVdHKgLdk6Vbwy5VUI37bXZuORZnoZCxar0lhhrLgbGMX/rOtr LTEiQPxkzr/9Y1AEX6RNxsh7BzoT/zDNJBuWYiGmRAwpny8Drj92A+n8avkCV2ys3e +ZVTj+7zFN+pwSdGU0uASrLQaonWOGPG+JWQX8yM= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander Graf , James Gowans , Mike Rapoport , Andrew Morton , linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 3/7] [hack] purgatory: disable purgatory verification. Date: Wed, 2 Oct 2024 18:07:18 +0200 Message-ID: <20241002160722.20025-4-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: BAA9040021 X-Stat-Signature: 9k3y3gdneqej3ms1dqrgpijqa68pbb8d X-Rspam-User: X-HE-Tag: 1727885339-360168 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Kstate changes data in the kexec segments after the checksum has been calculated, so the image fails the purgatory verification stage. Disable verification for now. A proper solution will follow in later versions of the patchset. 
Signed-off-by: Andrey Ryabinin --- arch/x86/purgatory/purgatory.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/purgatory/purgatory.c b/arch/x86/purgatory/purgatory.c index aea47e7939637..cdec5f21282a7 100644 --- a/arch/x86/purgatory/purgatory.c +++ b/arch/x86/purgatory/purgatory.c @@ -45,6 +45,8 @@ void purgatory(void) { int ret; + if (IS_ENABLED(CONFIG_KSTATE)) + return; ret = verify_sha256_digest(); if (ret) { /* loop forever */ From patchwork Wed Oct 2 16:07:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820017 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EDAECF6D3B for ; Wed, 2 Oct 2024 16:09:09 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A685D4401BF; Wed, 2 Oct 2024 12:09:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id A19424401B5; Wed, 2 Oct 2024 12:09:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 86BEA4401BF; Wed, 2 Oct 2024 12:09:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 5952B4401B5 for ; Wed, 2 Oct 2024 12:09:07 -0400 (EDT) Received: from smtpin19.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 0711E1A0AFF for ; Wed, 2 Oct 2024 16:09:07 +0000 (UTC) X-FDA: 82629146334.19.F617FB9 Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf01.hostedemail.com (Postfix) with ESMTP id EA1E74001E for ; Wed, 2 Oct 2024 16:09:04 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=yandex-team.com 
header.s=default header.b=Q0TLXURt; spf=pass (imf01.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885175; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vDCJtHtK3uU2sdBk8J/tnuJVXLHuR9McbyPPpAX4VPs=; b=yMVz/C6Zw9om6QvPF9cHWiprzWS9eIEvM93yxUFkqTzCQ6zP5WSs3Uh4LzV22ZJQZUuJar d7tLLaMiaZIfm8uzZGz/jglrF0YmB8UryMWug2PWCoPXKwWk0rBSp0PoBHKQ/3PLzMKTSd q4o7L+X96Zynlb/qygFCPtC9zJLxk3Y= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=Q0TLXURt; spf=pass (imf01.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885175; a=rsa-sha256; cv=none; b=xJFAASRPodW4bcUbtpGY4CX0XwzxXp9UlKcX8k0AaoEdQzmm1v5sE0QMoG5dgGYZMR/Nvj nIo4e1ZneYH9NBTV+oslRMdkZ+g+5MYD0TzESpu+vjFzpjbNgd9gEZlmHuN8Hc76W2ukjk Gxq18alLCiCAjedU92L+WDc+NOUrFDM= Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) with ESMTPS id 62E2A609C4; Wed, 2 Oct 2024 19:09:03 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-vYLh3gEa; Wed, 02 Oct 2024 19:09:02 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885342; 
bh=vDCJtHtK3uU2sdBk8J/tnuJVXLHuR9McbyPPpAX4VPs=; h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=Q0TLXURt2M11h05HQcTCotDXqV17/pTO/eYHxyvF7CpIRwLEfbBfbhK5guCrZhW7i FTOE8WK8gOdDKUCtCqsclUk1GbjcM0KngfLxMtHpSJRo9cn3UXM9NGQ39FLI8zLwTl y0g4IX5ZfCyGA9n7LHUYQ9btKkyR6EMQ9Tfr2MTA= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander Graf , James Gowans , Mike Rapoport , Andrew Morton , linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 5/7] kstate: Add mechanism to preserve specified memory pages across kexec. Date: Wed, 2 Oct 2024 18:07:20 +0200 Message-ID: <20241002160722.20025-6-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: EA1E74001E X-Stat-Signature: dk1ckbfrhjsfq77x11y3tqmx53dk6hue X-Rspam-User: X-HE-Tag: 1727885344-182220 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: This adds functionality to preserve memory pages across kexec. kstate_register_page() stores the struct page in a dedicated list of 'struct page_state' entries. At the kexec reboot stage this list is iterated and the pfns are saved into kstate's migrate stream. After kexec, the new kernel reads the pfns from the stream and marks the memory as reserved to keep it intact. The memory is also marked with the MEMBLOCK_PRSRV flag, indicating that the 'struct page' itself shouldn't be reinitialized. 
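The save/restore round trip described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code from this patch: every sketch_* name is hypothetical, a 4 KiB page size is assumed, and the restore side merely sums the bytes that the real code would hand to memblock_reserve(). Each record is one order byte followed by one 64-bit physical address, mirroring the (order, address) pairs the patch writes into the migrate stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define SKETCH_REC_SIZE  (1 + sizeof(uint64_t))	/* order byte + address */

/* Hypothetical stand-in for one registered page range. */
struct sketch_page_rec {
	uint8_t  order;	/* range covers SKETCH_PAGE_SIZE << order bytes */
	uint64_t phys;	/* physical address of the first page */
};

/* Append a value to the stream and return the advanced cursor. */
static uint8_t *sketch_put_byte(uint8_t *s, uint8_t v)
{
	*s = v;
	return s + 1;
}

static uint8_t *sketch_put_u64(uint8_t *s, uint64_t v)
{
	memcpy(s, &v, sizeof(v));
	return s + sizeof(v);
}

/* Read a value from the stream and advance the caller's cursor. */
static uint8_t sketch_get_byte(const uint8_t **s)
{
	uint8_t v = **s;

	*s += 1;
	return v;
}

static uint64_t sketch_get_u64(const uint8_t **s)
{
	uint64_t v;

	memcpy(&v, *s, sizeof(v));
	*s += sizeof(v);
	return v;
}

/* "Old kernel" side: serialize n records, return the bytes written. */
static size_t sketch_pages_save(uint8_t *stream, size_t cap,
				const struct sketch_page_rec *recs, size_t n)
{
	uint8_t *cur = stream;
	size_t i;

	assert(n * SKETCH_REC_SIZE <= cap);
	for (i = 0; i < n; i++) {
		cur = sketch_put_byte(cur, recs[i].order);
		cur = sketch_put_u64(cur, recs[i].phys);
	}
	return (size_t)(cur - stream);
}

/*
 * "New kernel" side: decode n records and return the total number of
 * bytes covered, i.e. what would be reserved and marked as preserved.
 */
static uint64_t sketch_pages_restore(const uint8_t *stream, size_t n,
				     struct sketch_page_rec *out)
{
	uint64_t reserved = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		out[i].order = sketch_get_byte(&stream);
		out[i].phys = sketch_get_u64(&stream);
		reserved += SKETCH_PAGE_SIZE << out[i].order;
	}
	return reserved;
}
```

Note that the record count travels separately from the records themselves, just as nr_pages is a separate field in the patch; the reader has to know it before it can decode the stream.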
Signed-off-by: Andrey Ryabinin --- arch/x86/kernel/kexec-bzimage64.c | 2 +- arch/x86/kernel/setup.c | 81 +++++++++++++++++++++++++++++++ include/linux/kstate.h | 6 +++ kernel/kstate.c | 7 +++ 4 files changed, 95 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index 71c82841e6b12..d769d08cf9a8a 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -406,7 +406,7 @@ static int load_migrate_segments(struct kimage *image) kbuf.memsz = 8*1024*1024; - kbuf.buf_align = ELF_CORE_HEADER_ALIGN; + kbuf.buf_align = PAGE_SIZE; kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(&kbuf); if (ret) diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index f1fea506e20f4..cfddc902e266b 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -638,6 +639,85 @@ static void __init e820_add_kernel_range(void) e820__range_add(start, size, E820_TYPE_RAM); } +#ifdef CONFIG_KSTATE +struct state_entry mem_kstate; + +struct mem_state { + unsigned int nr_pages; + struct list_head list; +}; +struct page_state { + struct list_head list; + int order; + struct page *page; +}; + +struct mem_state m_state = { .list = LIST_HEAD_INIT(m_state.list) }; + +int kstate_register_page(struct page *page, int order) +{ + struct page_state *state; + + state = kmalloc(sizeof(*state), GFP_KERNEL); + if (!state) + return -ENOMEM; + + state->page = page; + state->order = order; + list_add(&state->list, &m_state.list); + m_state.nr_pages++; + return 0; +} + +static int kstate_pages_save(void *mig_stream, void *obj, + const struct kstate_field *field) +{ + struct page_state *p_state; + void *start = mig_stream; + + list_for_each_entry(p_state, &m_state.list, list) { + mig_stream = kstate_save_byte(mig_stream, p_state->order); + mig_stream = kstate_save_ulong(mig_stream, page_to_phys(p_state->page)); + } + return 
mig_stream - start; +} + +static int __init kstate_pages_restore(void *mig_stream, void *obj, + const struct kstate_field *field) +{ + struct mem_state *m_state = obj; + int nr_pages, i; + + nr_pages = m_state->nr_pages; + for (i = 0; i < nr_pages; i++) { + int order = kstate_get_byte(&mig_stream); + unsigned long phys = kstate_get_ulong(&mig_stream); + + memblock_reserve(phys, PAGE_SIZE << order); + memblock_reserved_mark_preserved(phys, PAGE_SIZE << order); + } + return 0; +} + +struct kstate_description kstate_reserved = { + .name = "reserved_mem", + .id = KSTATE_RSVD_MEM_ID, + .state_list = LIST_HEAD_INIT(kstate_reserved.state_list), + .fields = (const struct kstate_field[]) { + KSTATE_SIMPLE(nr_pages, struct mem_state), + { + .name = "pages", + .flags = KS_CUSTOM, + .size = sizeof(struct mem_state), + .save = kstate_pages_save, + .restore = kstate_pages_restore, + }, + + KSTATE_END_OF_LIST() + }, +}; +#endif + static void __init early_reserve_memory(void) { /* @@ -989,6 +1069,7 @@ void __init setup_arch(char **cmdline_p) memblock_set_current_limit(ISA_END_ADDRESS); e820__memblock_setup(); + __kstate_register(&kstate_reserved, &m_state, &mem_kstate); /* * Needs to run after memblock setup because it needs the physical diff --git a/include/linux/kstate.h b/include/linux/kstate.h index c97804d0243ea..855acb339d5d7 100644 --- a/include/linux/kstate.h +++ b/include/linux/kstate.h @@ -29,6 +29,8 @@ struct kstate_field { }; enum kstate_ids { + KSTATE_PAGE_ID, + KSTATE_RSVD_MEM_ID, KSTATE_LAST_ID = -1, }; @@ -87,6 +89,10 @@ void *save_kstate(void *stream, int id, const struct kstate_description *kstate, void *obj); void *restore_kstate(struct kstate_entry *ke, int id, const struct kstate_description *kstate, void *obj); + +int kstate_page_save(void *mig_stream, void *obj, + const struct kstate_field *field); +int kstate_register_page(struct page *page, int order); #else #define __kstate_register(state, obj, se) diff --git a/kernel/kstate.c b/kernel/kstate.c index 
0ef228baef94e..7f7e135bafd81 100644 --- a/kernel/kstate.c +++ b/kernel/kstate.c @@ -182,6 +182,13 @@ int kstate_register(struct kstate_description *state, void *obj) return 0; } +int kstate_page_save(void *mig_stream, void *obj, + const struct kstate_field *field) +{ + kstate_register_page(*(struct page **)obj, 0); + return 0; +} + static int __init setup_migrate(char *arg) { char *end; From patchwork Wed Oct 2 16:07:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820018 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 988ADCF6D3E for ; Wed, 2 Oct 2024 16:09:12 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3D9906B03A3; Wed, 2 Oct 2024 12:09:10 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 38A486B03A5; Wed, 2 Oct 2024 12:09:10 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 190376B03A4; Wed, 2 Oct 2024 12:09:10 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id D54D16B0384 for ; Wed, 2 Oct 2024 12:09:09 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 72F411A0962 for ; Wed, 2 Oct 2024 16:09:09 +0000 (UTC) X-FDA: 82629146418.09.0A18FB4 Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf18.hostedemail.com (Postfix) with ESMTP id 6BF501C001D for ; Wed, 2 Oct 2024 16:09:07 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=QJ6MOGmC; spf=pass 
(imf18.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885220; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=0gXnv/qrKYuE/cS8ZBx3ikqYeKs1eCUWKuL7cMB0Reg=; b=h5gN1Y7+BkKtTRbJqggagf9+dSy18Avpyo/+2NMw/ZtT989g6ikZx/rKbH/Te9wkxCtkjL gsHpOka3F9n/mgdNXMWEzXPEoTbF4N7/kyHQB0/aeOUxEKs+YSHy0T510heywIrgMRzZGn VqpkOf8zNOG1h7Peg0TtOrnCxFJo3rM= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885220; a=rsa-sha256; cv=none; b=6oYssR9/NsvRs+7Y1sA6AZWhLZZBi7uInrB+5eOPacEBrFAfyRsPbLUS4STz4jjhrCeOCl 9wpcgT+VvnEyzsfViwAA34zcU05+gMjVd2eCJcDky21suJfFj17urtbEHddrr5rKhGCVZB O7UnhG4RGg6e46gO8FwImpvFMierlbY= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=QJ6MOGmC; spf=pass (imf18.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) with ESMTPS id E548260A23; Wed, 2 Oct 2024 19:09:05 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-nRmmw9QO; Wed, 02 Oct 2024 19:09:05 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885345; bh=0gXnv/qrKYuE/cS8ZBx3ikqYeKs1eCUWKuL7cMB0Reg=; 
h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=QJ6MOGmC2WXQfZHzJZ1VaR1B8rKzNr7QnMVdzqdp+QQv3P0S3FHMiPewEj2rscP08 wpInBKldjvUn90+GV+UmlmIIYMJDqh6JpsBqKYkROXaqaEMgJ2zuIgRq06YpaV+5qL OvdxqE3T7r9VwN+6lIu4k91/bcBG6keVqTB4vvQk= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander Graf , James Gowans , Mike Rapoport , Andrew Morton , linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 6/7] kstate, test: add test module for testing kstate subsystem. Date: Wed, 2 Oct 2024 18:07:21 +0200 Message-ID: <20241002160722.20025-7-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspamd-Queue-Id: 6BF501C001D X-Stat-Signature: jfainoec9dc9k3tr1u8y3f17943kg56r X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1727885347-950031 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: This is a simple test and playground useful for kstate subsystem development. It contains a structure with different kinds of data, which is migrated across kexec to the new kernel using kstate. 
Signed-off-by: Andrey Ryabinin --- include/linux/kstate.h | 1 + lib/Makefile | 2 + lib/test_kstate.c | 89 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 92 insertions(+) create mode 100644 lib/test_kstate.c diff --git a/include/linux/kstate.h b/include/linux/kstate.h index 855acb339d5d7..2ddbe41a1f171 100644 --- a/include/linux/kstate.h +++ b/include/linux/kstate.h @@ -31,6 +31,7 @@ struct kstate_field { enum kstate_ids { KSTATE_PAGE_ID, KSTATE_RSVD_MEM_ID, + KSTATE_TEST_ID, KSTATE_LAST_ID = -1, }; diff --git a/lib/Makefile b/lib/Makefile index 773adf88af416..2432e47664c35 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -354,6 +354,8 @@ obj-$(CONFIG_PARMAN) += parman.o obj-y += group_cpus.o +obj-$(CONFIG_KSTATE) += test_kstate.o + # GCC library routines obj-$(CONFIG_GENERIC_LIB_ASHLDI3) += ashldi3.o obj-$(CONFIG_GENERIC_LIB_ASHRDI3) += ashrdi3.o diff --git a/lib/test_kstate.c b/lib/test_kstate.c new file mode 100644 index 0000000000000..e95e3110f8949 --- /dev/null +++ b/lib/test_kstate.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include + +unsigned long ulong_val; +struct kstate_test_data { + int i; + unsigned long *p_ulong; + char s[10]; + struct page *page; +}; + +struct kstate_description test_state = { + .name = "test", + .version_id = 1, + .id = KSTATE_TEST_ID, + .state_list = LIST_HEAD_INIT(test_state.state_list), + .fields = (const struct kstate_field[]) { + KSTATE_SIMPLE(i, struct kstate_test_data), + KSTATE_SIMPLE(s, struct kstate_test_data), + KSTATE_POINTER(p_ulong, struct kstate_test_data), + { + .name = "page", + .flags = KS_CUSTOM, + .offset = offsetof(struct kstate_test_data, page), + .save = kstate_page_save, + }, + KSTATE_SIMPLE(page, struct kstate_test_data), + KSTATE_END_OF_LIST() + }, +}; + +static struct kstate_test_data test_data; + +static int init_test_data(void) +{ + struct page *page; + int i; + + test_data.i = 10; + ulong_val = 20; + memcpy(test_data.s, "abcdefghk", 
sizeof(test_data.s)); + page = alloc_page(GFP_KERNEL); + if (!page) + return -ENOMEM; + + for (i = 0; i < PAGE_SIZE/4; i += 4) + *((u32 *)page_address(page) + i) = 0xdeadbeef; + test_data.page = page; + return 0; +} + +static void validate_test_data(void) +{ + int i; + + WARN_ON(test_data.i != 10); + WARN_ON(*test_data.p_ulong != 20); + WARN_ON(strcmp(test_data.s, "abcdefghk") != 0); + + for (i = 0; i < PAGE_SIZE/4; i += 4) { + u32 val = *((u32 *)page_address(test_data.page) + i); + + WARN_ON(val != 0xdeadbeef); + } +} + +static int __init test_kstate_init(void) +{ + int ret = 0; + + test_data.p_ulong = &ulong_val; + + if (!is_migrate_kernel()) { + ret = init_test_data(); + if (ret) + goto out; + } + + kstate_register(&test_state, &test_data); + + validate_test_data(); + +out: + return ret; +} +__initcall(test_kstate_init); From patchwork Wed Oct 2 16:07:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Ryabinin X-Patchwork-Id: 13820019 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A79B8CF6D3B for ; Wed, 2 Oct 2024 16:09:15 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 22E434401C0; Wed, 2 Oct 2024 12:09:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 191124401B5; Wed, 2 Oct 2024 12:09:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id ED4C54401C0; Wed, 2 Oct 2024 12:09:12 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id C3AD64401B5 for ; Wed, 2 Oct 2024 12:09:12 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com 
(Postfix) with ESMTP id 4F1971C6862 for ; Wed, 2 Oct 2024 16:09:12 +0000 (UTC) X-FDA: 82629146544.07.EC56BB4 Received: from forwardcorp1d.mail.yandex.net (forwardcorp1d.mail.yandex.net [178.154.239.200]) by imf29.hostedemail.com (Postfix) with ESMTP id 2DDBF12002E for ; Wed, 2 Oct 2024 16:09:09 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=F5+i2ZBr; spf=pass (imf29.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727885310; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=r6KOHiXGwu9wJr+jMWOsPrgnVfkjQk3UoBVxWjKVRQc=; b=BkmjWHnG6Q2PHQtWVXTemH0Rhae4fmREgHqqd6v1fjVHopNg9Msm/GnieHA0Up1DlP+Hyd jrhEhsE4D99Yl8NaZTWxw+EQxe5G9J9fB5dmCin5sizcL4fPR6EDxy51PjRXbyOaiGeLmf LxcBpUKEKmb8RlF2wI8xbWBc6VpCFbc= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=yandex-team.com header.s=default header.b=F5+i2ZBr; spf=pass (imf29.hostedemail.com: domain of arbn@yandex-team.com designates 178.154.239.200 as permitted sender) smtp.mailfrom=arbn@yandex-team.com; dmarc=pass (policy=none) header.from=yandex-team.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727885310; a=rsa-sha256; cv=none; b=QNcBVxHMZT19Yb+tOn/jV6uU7GCpa6WWv9q+h2KgBcTiUh1l9r5r31UOggsojycbQVH35y ashHkx2ZDxGDPAp0MMMWDGBo9TtxAIGwQnCdCrPPzUu2erMgAblPRI2ZavOJBRMolS/+O1 uqsdUrtLAJdcnbuXckY6CLt8DRU8RuA= Received: from mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net [IPv6:2a02:6b8:c42:b1cb:0:640:2a1e:0]) by forwardcorp1d.mail.yandex.net (Yandex) 
with ESMTPS id 9F16460949; Wed, 2 Oct 2024 19:09:08 +0300 (MSK) Received: from dellarbn.yandex.net (unknown [10.214.35.248]) by mail-nwsmtp-smtp-corp-main-56.klg.yp-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id Z8emWD2IhiE0-ueku0Pjg; Wed, 02 Oct 2024 19:09:07 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.com; s=default; t=1727885347; bh=r6KOHiXGwu9wJr+jMWOsPrgnVfkjQk3UoBVxWjKVRQc=; h=Message-ID:Date:In-Reply-To:Cc:Subject:References:To:From; b=F5+i2ZBr2Uq7UPOSwZbgC7ZA0tYzgqL47KGhDszAsbQa6F6J/7vbR7Ko3bOmn6riv 8JQEDe8b3xruR/nSABrpSw17jrHYUuNZVIz8HuRdoAPk5CYiB7gV3oWoy60iWQ6ke4 u3C0/g9G/pAFLeQBOSS9YnP/FYBQuMXxxe4R9D2w= From: Andrey Ryabinin To: linux-kernel@vger.kernel.org Cc: Alexander Graf , James Gowans , Mike Rapoport , Andrew Morton , linux-mm@kvack.org, Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" , Eric Biederman , kexec@lists.infradead.org, Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , linux-trace-kernel@vger.kernel.org, valesini@yandex-team.com, Andrey Ryabinin Subject: [RFC PATCH 7/7] trace: migrate trace buffers across kexec Date: Wed, 2 Oct 2024 18:07:22 +0200 Message-ID: <20241002160722.20025-8-arbn@yandex-team.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20241002160722.20025-1-arbn@yandex-team.com> References: <20241002160722.20025-1-arbn@yandex-team.com> MIME-Version: 1.0 X-Yandex-Filter: 1 X-Rspam-User: X-Stat-Signature: gqzg8wyi935sr7k8gqphk8841c6po8cg X-Rspamd-Queue-Id: 2DDBF12002E X-Rspamd-Server: rspam11 X-HE-Tag: 1727885349-195764 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: This is a demonstration of kstate's ability to migrate across kexec something more complex than the simple structure in the lib/test_kstate.c module. Here we migrate tracing_on/current_trace and the contents of the trace buffers to the new kernel. The 'global_trace_state' describes the 'tracing_on' and 'current_trace' states. 
The 'trace_buffer' kstate field in 'global_trace_state' points to
'kstate_trace_buffer', which describes the state of the ring buffers.
The code in kstate_rb_[save/restore]() saves and restores the list of
buffer pages. It turned out somewhat hacky and ugly, partly because
kstate currently can't migrate slab data. Because of that, we have to
save/restore the positions of commit_page/reader_page/etc. in the list
of pages.

We could probably teach kstate to migrate slab pages, preserving their
contents at the same address. That would make it easier to migrate lists
such as the trace ring buffer list, as we would only need to
save/restore a pointer.

Signed-off-by: Andrey Ryabinin
---
 include/linux/kstate.h     |   4 +
 kernel/trace/ring_buffer.c | 189 +++++++++++++++++++++++++++++++++++++
 kernel/trace/trace.c       |  81 ++++++++++++++++
 3 files changed, 274 insertions(+)

diff --git a/include/linux/kstate.h b/include/linux/kstate.h
index 2ddbe41a1f171..ae807a75a02f8 100644
--- a/include/linux/kstate.h
+++ b/include/linux/kstate.h
@@ -32,6 +32,10 @@ enum kstate_ids {
 	KSTATE_PAGE_ID,
 	KSTATE_RSVD_MEM_ID,
 	KSTATE_TEST_ID,
+	KSTATE_TRACE_ID,
+	KSTATE_TRACE_BUFFER_ID,
+	KSTATE_TRACE_RING_BUFFER_ID,
+	KSTATE_TRACE_BUFFER_PAGE_ID,
 	KSTATE_LAST_ID = -1,
 };
 
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 77dc0b25140e6..9a8692d7d960c 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include /* for self test */
 #include
 #include
@@ -1467,6 +1468,194 @@ static void rb_tail_page_update(struct ring_buffer_per_cpu *cpu_buffer,
 	}
 }
 
+#ifdef CONFIG_KSTATE
+static int kstate_bpage_save(void *mig_stream, void *obj, const struct kstate_field *field)
+{
+	struct buffer_page *bpage = obj;
+
+	kstate_register_page(virt_to_page(bpage->page), bpage->order);
+	return 0;
+
+}
+struct kstate_description kstate_buffer_page = {
+	.name = "buffer_page",
+	.id = KSTATE_TRACE_BUFFER_PAGE_ID,
+	.fields = (const struct kstate_field[]) {
+		KSTATE_SIMPLE(write, struct buffer_page),
+		KSTATE_SIMPLE(read, struct buffer_page),
+		KSTATE_SIMPLE(entries, struct buffer_page),
+		KSTATE_SIMPLE(real_end, struct buffer_page),
+		KSTATE_SIMPLE(order, struct buffer_page),
+		KSTATE_SIMPLE(page, struct buffer_page),
+		{
+			.name = "buffer_page",
+			.flags = KS_CUSTOM,
+			.save = kstate_bpage_save,
+			.size = (sizeof(struct buffer_page)),
+		},
+		KSTATE_END_OF_LIST(),
+	},
+};
+
+static void restore_pages_positions(void **mig_stream,
+				struct ring_buffer_per_cpu *cpu_buffer)
+{
+	struct list_head *tmp;
+	struct list_head *head = rb_list_head(cpu_buffer->pages);
+	unsigned long commit_page_nr, reader_page_nr,
+		head_page_nr, tail_page_nr;
+	int i = 0;
+
+	commit_page_nr = kstate_get_ulong(mig_stream);
+	reader_page_nr = kstate_get_ulong(mig_stream);
+	head_page_nr = kstate_get_ulong(mig_stream);
+	tail_page_nr = kstate_get_ulong(mig_stream);
+
+	for (tmp = head;;) {
+		struct buffer_page *page = (struct buffer_page *)tmp;
+
+		if (commit_page_nr == i)
+			cpu_buffer->commit_page = page;
+		if (reader_page_nr == i)
+			cpu_buffer->reader_page = page;
+		if (head_page_nr == i)
+			cpu_buffer->head_page = page;
+		if (tail_page_nr == i)
+			cpu_buffer->tail_page = page;
+		i++;
+		tmp = rb_list_head(tmp->next);
+		if (tmp == head)
+			break;
+	}
+}
+
+static int kstate_rb_restore(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct ring_buffer_per_cpu *cpu_buffer = obj;
+	LIST_HEAD(pages);
+	void *stream_start = mig_stream;
+	struct buffer_page *page;
+	struct list_head *tmp;
+	struct list_head *head = rb_list_head(cpu_buffer->pages);
+	int i = 0;
+
+	while (kstate_get_byte(&mig_stream)) {
+		int j = 0;
+		bool page_exists = false;
+
+		for (tmp = rb_list_head(head->next); tmp != head;
+		     tmp = rb_list_head(tmp->next)) {
+			if (j == i) {
+				page_exists = true;
+				page = (struct buffer_page *)tmp;
+				break;
+			}
+			j++;
+		}
+		if (!page_exists) {
+			struct buffer_page *bpage;
+
+			bpage = kzalloc_node(ALIGN(sizeof(*bpage),
+					cache_line_size()), GFP_KERNEL,
+					cpu_to_node(cpu_buffer->cpu));
+			list_add(&bpage->list, &pages);
+			page = bpage;
+		}
+		mig_stream = restore_kstate((struct kstate_entry *)mig_stream,
+					i++, field->ksd, page);
+	}
+
+	restore_pages_positions(&mig_stream, cpu_buffer);
+
+	return mig_stream - stream_start;
+}
+
+static int kstate_rb_save(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct ring_buffer_per_cpu *cpu_buffer = obj;
+	struct list_head *tmp;
+	struct list_head *head = rb_list_head(cpu_buffer->pages);
+	void *stream_start = mig_stream;
+	unsigned long commit_page_nr, reader_page_nr,
+		head_page_nr, tail_page_nr;
+	int i = 0;
+
+
+	for (tmp = head;;) {
+		struct buffer_page *page = (struct buffer_page *)tmp;
+
+		mig_stream = kstate_save_byte(mig_stream, 1);
+		mig_stream = save_kstate(mig_stream, i, field->ksd, page);
+
+		if (cpu_buffer->commit_page == page)
+			commit_page_nr = i;
+		if (cpu_buffer->reader_page == page)
+			reader_page_nr = i;
+		if (cpu_buffer->head_page == page)
+			head_page_nr = i;
+		if (cpu_buffer->tail_page == page)
+			tail_page_nr = i;
+		i++;
+		tmp = rb_list_head(tmp->next);
+		if (tmp == head)
+			break;
+	}
+
+	mig_stream = kstate_save_byte(mig_stream, 0);
+
+	/* save pages positions */
+	mig_stream = kstate_save_ulong(mig_stream, commit_page_nr);
+	mig_stream = kstate_save_ulong(mig_stream, reader_page_nr);
+	mig_stream = kstate_save_ulong(mig_stream, head_page_nr);
+	mig_stream = kstate_save_ulong(mig_stream, tail_page_nr);
+
+	return mig_stream - stream_start;
+}
+
+struct kstate_description kstate_ring_buffer_per_cpu = {
+	.name = "ring_buffer_per_cpu",
+	.id = KSTATE_TRACE_RING_BUFFER_ID,
+	.state_list = LIST_HEAD_INIT(kstate_ring_buffer_per_cpu.state_list),
+	.fields = (const struct kstate_field[]) {
+		KSTATE_SIMPLE(entries, struct ring_buffer_per_cpu),
+		KSTATE_SIMPLE(entries_bytes, struct ring_buffer_per_cpu),
+		{
+			.name = "buffer_pages",
+			.flags = KS_CUSTOM,
+			.size = (sizeof(struct ring_buffer_per_cpu)),
+			.ksd = &kstate_buffer_page,
+			.save = kstate_rb_save,
+			.restore = kstate_rb_restore,
+		},
+		KSTATE_END_OF_LIST(),
+	},
+};
+
+static int nr_ring_buffers(void)
+{
+	return nr_cpu_ids;
+}
+
+struct kstate_description kstate_trace_buffer = {
+	.name = "trace_buffer",
+	.id = KSTATE_TRACE_BUFFER_ID,
+	.state_list = LIST_HEAD_INIT(kstate_trace_buffer.state_list),
+	.fields = (const struct kstate_field[]) {
+		{
+			.name = "ring_buffers",
+			.flags = KS_STRUCT|KS_POINTER|KS_ARRAY_OF_POINTER,
+			.size = (sizeof(struct ring_buffer_per_cpu *)),
+			.offset = offsetof(struct trace_buffer, buffers),
+			.count = nr_ring_buffers,
+			.ksd = &kstate_ring_buffer_per_cpu,
+		},
+		KSTATE_END_OF_LIST(),
+	}
+};
+#endif
+
 static void rb_check_bpage(struct ring_buffer_per_cpu *cpu_buffer,
 			  struct buffer_page *bpage)
 {
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c01375adc4714..bb07d716beab4 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -10621,6 +10622,84 @@ __init static void enable_instances(void)
 	}
 }
 
+#ifdef CONFIG_KSTATE
+static int cur_trace_save(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct trace_array *tr = obj;
+
+	return strscpy(mig_stream, tr->current_trace->name, 100) + 1;
+}
+
+static int cur_trace_restore(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct trace_array *tr = obj;
+
+	tracing_set_tracer(tr, mig_stream);
+	return strlen(mig_stream) + 1;
+}
+
+static int tracing_on_save(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct trace_array *tr = obj;
+
+	*(u8 *)mig_stream = (u8)tracer_tracing_is_on(tr);
+	return sizeof(u8);
+
+}
+
+static int tracing_on_restore(void *mig_stream, void *obj,
+			const struct kstate_field *field)
+{
+	struct trace_array *tr = obj;
+	u8 on = *(u8 *)mig_stream;
+
+	if (on)
+		tracer_tracing_on(tr);
+	else
+		tracer_tracing_off(tr);
+
+	return sizeof(on);
+}
+
+extern struct kstate_description kstate_trace_buffer;
+
+struct kstate_description global_trace_state = {
+	.name = "trace_state",
+	.id = KSTATE_TRACE_ID,
+	.version_id = 1,
+	.state_list = LIST_HEAD_INIT(global_trace_state.state_list),
+	.fields = (const struct kstate_field[]) {
+		{
+			.name = "tracing_on",
+			.flags = KS_CUSTOM,
+			.version_id = 0,
+			.size = sizeof(struct trace_array),
+			.save = tracing_on_save,
+			.restore = tracing_on_restore,
+		},
+		{
+			.name = "current_trace",
+			.flags = KS_CUSTOM,
+			.version_id = 0,
+			.size = sizeof(struct trace_array),
+			.save = cur_trace_save,
+			.restore = cur_trace_restore,
+
+		},
+		{
+			.name = "trace_buffer",
+			.flags = KS_STRUCT|KS_POINTER,
+			.offset = offsetof(struct trace_array, array_buffer.buffer),
+			.ksd = &kstate_trace_buffer,
+		},
+		KSTATE_END_OF_LIST()
+	},
+};
+#endif
+
 __init static int tracer_alloc_buffers(void)
 {
 	int ring_buf_size;
@@ -10848,6 +10927,8 @@ __init static int late_trace_init(void)
 
 	tracing_set_default_clock();
 	clear_boot_tracer();
+	kstate_register(&global_trace_state, &global_trace);
+
 	return 0;
 }
 
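
For readers unfamiliar with the trick used for commit_page/reader_page/etc.,
here is a minimal userspace sketch of the idea: pointers into a circular list
are saved as positions (indices) and turned back into pointers by walking the
list on the other side. It is not the kernel code. 'struct node',
'struct ring', node_index(), node_at(), save_positions() and
restore_positions() are hypothetical names invented for this illustration, and
the sketch assumes every saved pointer is a member of the ring.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer page linked into a circular list. */
struct node {
	struct node *next;
};

/* Hypothetical stand-in for the per-CPU buffer and its special pointers. */
struct ring {
	struct node *head;	/* any node of the circular list */
	struct node *commit;	/* points at some node in the list */
	struct node *tail;	/* points at some node in the list */
};

/* Position of @target counted from @head, or -1 if it is not in the ring. */
static long node_index(const struct node *head, const struct node *target)
{
	const struct node *n = head;
	long i = 0;

	do {
		if (n == target)
			return i;
		n = n->next;
		i++;
	} while (n != head);

	return -1;
}

/* Node found @idx steps after @head (wraps around if idx >= ring length). */
static struct node *node_at(struct node *head, long idx)
{
	struct node *n = head;

	assert(idx >= 0);	/* a saved -1 means the pointer was not in the ring */
	while (idx--)
		n = n->next;
	return n;
}

/* "Save" side: addresses are meaningless after kexec, positions are not. */
static void save_positions(const struct ring *r, long out[2])
{
	out[0] = node_index(r->head, r->commit);
	out[1] = node_index(r->head, r->tail);
}

/* "Restore" side: map positions back to pointers in the rebuilt ring. */
static void restore_positions(struct ring *r, const long in[2])
{
	r->commit = node_at(r->head, in[0]);
	r->tail = node_at(r->head, in[1]);
}
```

A caller would build the ring, call save_positions() before the handover, and
call restore_positions() on the rebuilt ring afterwards. If the list nodes
themselves kept their addresses across kexec, the position bookkeeping would
be unnecessary and only the pointers would need to be carried over.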