From patchwork Thu Feb 6 13:27:49 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Rapoport
X-Patchwork-Id: 13963087
From: Mike Rapoport
To: linux-kernel@vger.kernel.org
Cc: Alexander Graf, Andrew Morton, Andy Lutomirski, Anthony Yznaga,
 Arnd Bergmann, Ashish Kalra, Benjamin Herrenschmidt, Borislav Petkov,
 Catalin Marinas, Dave Hansen, David Woodhouse, Eric Biederman, Ingo Molnar,
 James Gowans, Jonathan Corbet, Krzysztof Kozlowski, Mark Rutland,
 Mike Rapoport, Paolo Bonzini, Pasha Tatashin, "H. Peter Anvin",
 Peter Zijlstra, Pratyush Yadav, Rob Herring, Saravana Kannan,
 Stanislav Kinsburskii, Steven Rostedt, Thomas Gleixner, Tom Lendacky,
 Usama Arif, Will Deacon, devicetree@vger.kernel.org,
 kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 linux-doc@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org
Subject: [PATCH v4 09/14] kexec: Add documentation for KHO
Date: Thu, 6 Feb 2025 15:27:49 +0200
Message-ID: <20250206132754.2596694-10-rppt@kernel.org>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250206132754.2596694-1-rppt@kernel.org>
References: <20250206132754.2596694-1-rppt@kernel.org>

From: Alexander Graf

With KHO in place, let's add documentation that describes what it is and
how to use it.

Signed-off-by: Alexander Graf
Co-developed-by: Mike Rapoport (Microsoft)
Signed-off-by: Mike Rapoport (Microsoft)
---
 Documentation/kho/concepts.rst   | 80 ++++++++++++++++++++++++++++++++
 Documentation/kho/index.rst      | 19 ++++++++
 Documentation/kho/usage.rst      | 60 ++++++++++++++++++++++++
 Documentation/subsystem-apis.rst |  1 +
 MAINTAINERS                      |  1 +
 5 files changed, 161 insertions(+)
 create mode 100644 Documentation/kho/concepts.rst
 create mode 100644 Documentation/kho/index.rst
 create mode 100644 Documentation/kho/usage.rst

diff --git a/Documentation/kho/concepts.rst b/Documentation/kho/concepts.rst
new file mode 100644
index 000000000000..232bddacc0ef
--- /dev/null
+++ b/Documentation/kho/concepts.rst
@@ -0,0 +1,80 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+=======================
+Kexec Handover Concepts
+=======================
+
+Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state -
+arbitrary properties as well as memory locations - across kexec.
+
+It introduces multiple concepts:
+
+KHO Device Tree
+---------------
+
+Every KHO kexec carries a KHO-specific flattened device tree blob that
+describes the state of the system. Device drivers can register with KHO to
+serialize their state before kexec. After a KHO kexec, device drivers can
+read the device tree and extract the previous state.
+
+KHO only uses the fdt container format and libfdt library, but does not
+adhere to the same property semantics that normal device trees do: Properties
+are passed in native endianness and standardized properties like ``reg`` and
+``ranges`` do not exist, hence there are no ``#...-cells`` properties.
+
+KHO introduces a new concept to its device tree: ``mem`` properties. A
+``mem`` property can be inside any subnode in the device tree. When present,
+it contains an array of physical memory ranges that the new kernel must mark
+as reserved on boot. It is recommended, but not required, to make these ranges
+as physically contiguous as possible to reduce the number of array elements ::
+
+    struct kho_mem {
+            __u64 addr;
+            __u64 len;
+    };
+
+After boot, drivers can call the KHO subsystem to transfer ownership of memory
+that was reserved via a ``mem`` property to themselves to continue using memory
+from the previous execution.
+
+The KHO device tree follows the in-Linux schema requirements. Any element in
+the device tree is documented via device tree schema YAML files that explain
+what data gets transferred.
+
+Scratch Regions
+---------------
+
+To boot into kexec, we need to have a physically contiguous memory range that
+contains no handed over memory. Kexec then places the target kernel and initrd
+into that region. The new kernel exclusively uses this region for memory
+allocations during boot, up to the initialization of the page allocator.
+
+We guarantee that we always have such regions through the scratch regions: On
+first boot KHO allocates several physically contiguous memory regions. Since
+after kexec these regions will be used by early memory allocations, there is a
+scratch region per NUMA node plus a scratch region to satisfy allocation
+requests that do not require a particular NUMA node assignment. By default,
+the size of the scratch regions is calculated based on the amount of memory
+allocated during boot. The ``kho_scratch`` kernel command line option may be
+used to explicitly define the size of the scratch regions. The scratch regions
+are declared as CMA when the page allocator is initialized so that their
+memory can be used during system lifetime. CMA guarantees that no handover
+pages land in that region, because handover pages must be at a static physical
+memory location and CMA enforces that only movable pages can be located inside.
+
+After KHO kexec, we ignore the ``kho_scratch`` kernel command line option and
+instead reuse the exact same region that was originally allocated. This allows
+us to recursively execute any number of KHO kexecs. Because we used this region
+for boot memory allocations and as target memory for kexec blobs, some parts
+of that memory region may be reserved. These reservations are irrelevant for
+the next KHO kexec, because kexec can overwrite even the original kernel.
+
+KHO active phase
+----------------
+
+To enable a user space based kexec file loader, the kernel needs to be able to
+provide the device tree that describes the previous kernel's state before
+performing the actual kexec. The process of generating that device tree is
+called serialization. When the device tree is generated, some properties
+of the system may become immutable because they are already written down
+in the device tree. That state is called the KHO active phase.
diff --git a/Documentation/kho/index.rst b/Documentation/kho/index.rst
new file mode 100644
index 000000000000..5e7eeeca8520
--- /dev/null
+++ b/Documentation/kho/index.rst
@@ -0,0 +1,19 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+========================
+Kexec Handover Subsystem
+========================
+
+.. toctree::
+   :maxdepth: 1
+
+   concepts
+   usage
+
+.. only:: subproject and html
+
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/kho/usage.rst b/Documentation/kho/usage.rst
new file mode 100644
index 000000000000..e7300fbb309c
--- /dev/null
+++ b/Documentation/kho/usage.rst
@@ -0,0 +1,60 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+====================
+Kexec Handover Usage
+====================
+
+Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state -
+arbitrary properties as well as memory locations - across kexec.
+
+This document expects that you are familiar with the base KHO concepts
+described in :ref:`Documentation/kho/concepts.rst `. If you have not read
+them yet, please do so now.
+
+Prerequisites
+-------------
+
+KHO is available when the ``CONFIG_KEXEC_HANDOVER`` config option is set to y
+at compile time. Every KHO producer may have its own config option that you
+need to enable if you would like to preserve their respective state across
+kexec.
+
+To use KHO, please boot the kernel with the ``kho=on`` command line
+parameter. You may use the ``kho_scratch`` parameter to define the size of
+the scratch regions. For example, ``kho_scratch=512M,512M`` will reserve a
+512 MiB global scratch region and a 512 MiB scratch region per NUMA node
+on boot.
+
+Perform a KHO kexec
+-------------------
+
+Before you can perform a KHO kexec, you need to move the system into the KHO
+active phase (see :ref:`Documentation/kho/concepts.rst `) ::
+
+  $ echo 1 > /sys/kernel/kho/active
+
+After this command, the KHO device tree is available in ``/sys/kernel/kho/dt``.
+
+Next, load the target payload and kexec into it. It is important that you
+use the ``-s`` parameter to use the in-kernel kexec file loader, as user
+space kexec tooling currently has no support for KHO with the user space
+based file loader ::
+
+  # kexec -l Image --initrd=initrd -s
+  # kexec -e
+
+The new kernel will boot up and contain some of the previous kernel's state.
+
+For example, if you used the ``reserve_mem`` command line parameter to create
+an early memory reservation, the new kernel will have that memory at the
+same physical address as the old kernel.
+
+Abort a KHO kexec
+-----------------
+
+You can move the system out of the KHO active phase again by calling ::
+
+  $ echo 0 > /sys/kernel/kho/active
+
+After this command, the KHO device tree is no longer available in
+``/sys/kernel/kho/dt``.
diff --git a/Documentation/subsystem-apis.rst b/Documentation/subsystem-apis.rst
index b52ad5b969d4..5fc69d6ff9f0 100644
--- a/Documentation/subsystem-apis.rst
+++ b/Documentation/subsystem-apis.rst
@@ -90,3 +90,4 @@ Other subsystems
    peci/index
    wmi/index
    tee/index
+   kho/index
diff --git a/MAINTAINERS b/MAINTAINERS
index e1e01b2a3727..82c2ef421c00 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12828,6 +12828,7 @@ S:	Maintained
 W:	http://kernel.org/pub/linux/utils/kernel/kexec/
 F:	Documentation/ABI/testing/sysfs-firmware-kho
 F:	Documentation/ABI/testing/sysfs-kernel-kho
+F:	Documentation/kho/
 F:	include/linux/kexec.h
 F:	include/uapi/linux/kexec.h
 F:	kernel/kexec*
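
Illustration only, not part of the patch: concepts.rst recommends keeping the
ranges in a ``mem`` property as physically contiguous as possible so that the
array stays short. Below is a minimal user-space sketch of that idea, assuming
a producer holds its ranges in a plain array before serialization.
``kho_mem_coalesce()`` and ``kho_mem_cmp()`` are hypothetical helpers, not KHO
or kernel APIs, and ``uint64_t`` stands in for ``__u64``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same layout as struct kho_mem in concepts.rst; uint64_t replaces __u64. */
struct kho_mem {
	uint64_t addr;
	uint64_t len;
};

/* qsort comparator: order ranges by ascending start address. */
static int kho_mem_cmp(const void *a, const void *b)
{
	const struct kho_mem *x = a, *y = b;

	if (x->addr < y->addr)
		return -1;
	return x->addr > y->addr;
}

/*
 * Hypothetical helper (an assumption for this sketch, not a kernel API):
 * sort the array in place and merge ranges that touch or overlap.
 * Returns the number of elements left in the array.
 * Assumes addr + len does not overflow.
 */
size_t kho_mem_coalesce(struct kho_mem *r, size_t n)
{
	size_t out = 0;

	assert(n == 0 || r != NULL);
	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), kho_mem_cmp);

	for (size_t i = 1; i < n; i++) {
		uint64_t end = r[out].addr + r[out].len;

		if (r[i].addr <= end) {
			/* Adjacent or overlapping: extend the current range. */
			uint64_t new_end = r[i].addr + r[i].len;

			if (new_end > end)
				r[out].len = new_end - r[out].addr;
		} else {
			/* Gap: start a new output range. */
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

With this sketch, three page-sized ranges at 0x1000, 0x2000 and 0x3000 plus
one unrelated range collapse into two array elements, which is the kind of
reduction the recommendation is after.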