diff mbox series

[v2,24/35] xen/console: introduce hwdom_crashconsole=

Message ID 20241205-vuart-ns8250-v1-24-e9aa923127eb@ford.com (mailing list archive)
State New
Headers show
Series Introduce NS8250 UART emulator | expand

Commit Message

Denis Mukhin via B4 Relay Dec. 6, 2024, 4:41 a.m. UTC
From: Denis Mukhin <dmukhin@ford.com>

The new command line switch `hwdom_crashconsole=BOOL` allows to switch serial
console input focus to xen for machine state inspection using keyhandler
mechanism after the hardware domain crashes.

The new command line switch is aliased via `dom0=...,crashconsole` knob.

Such functionality can be useful while debugging dom0 bringup.

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
---
 docs/misc/xen-command-line.pandoc |  5 +++++
 xen/arch/x86/dom0_build.c         |  2 ++
 xen/common/domain.c               | 14 +++++++++++++-
 xen/include/xen/domain.h          |  1 +
 4 files changed, 21 insertions(+), 1 deletion(-)

Comments

Roger Pau Monné Dec. 12, 2024, 12:29 p.m. UTC | #1
On Thu, Dec 05, 2024 at 08:41:54PM -0800, Denis Mukhin via B4 Relay wrote:
> From: Denis Mukhin <dmukhin@ford.com>
> 
> The new command line switch `hwdom_crashconsole=BOOL` allows to switch serial
> console input focus to xen for machine state inspection using keyhandler
> mechanism after the hardware domain crashes.
> 
> The new command line switch is aliased via `dom0=...,crashconsole` knob.
> 
> Such functionality can be useful while debugging dom0 bringup.
> 
> Signed-off-by: Denis Mukhin <dmukhin@ford.com>
> ---
>  docs/misc/xen-command-line.pandoc |  5 +++++
>  xen/arch/x86/dom0_build.c         |  2 ++
>  xen/common/domain.c               | 14 +++++++++++++-
>  xen/include/xen/domain.h          |  1 +
>  4 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 293dbc1a957ba6e668fd4d55d58e84f643822126..fb77d7dca1ea517f79d6713aa6909422f31e7724 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -806,6 +806,7 @@ Specify the bit width of the DMA heap.
>  
>  ### dom0
>      = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
> +                crashconsole=<bool>,
>                  cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86)
>  
>      = List of [ sve=<integer> ] (Arm64)
> @@ -839,6 +840,10 @@ Controls for how dom0 is constructed on x86 systems.
>      information during the dom0 build.  It defaults to the compile time choice
>      of `CONFIG_VERBOSE_DEBUG`.
>  
> +*   The `crashconsole` boolean instructs Xen to drop into emergency console
> +    in case of dom0 crash. May be useful for dom0 bringup on a custom

I think the 'a' is unneeded -> "on custom hardware."

I think however this would be clearer as:

"The `crashconsole` boolean instructs Xen to switch input console
focus to the hypervisor when dom0 shuts down and avoid performing
dom0 domain destruction.  Should only be used for debugging
purposes."

It's IMO not clear what "instructs Xen to drop into emergency console"
implies for Xen.

> +    hardware.
> +
>  *   The `cpuid-faulting` boolean is an interim option, is only applicable to
>      PV dom0, and defaults to true.
>  
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index e8f5bf5447bc47a6daa3d95787106a4c11e80d31..706aeec0ecbb565a415edbfb33ca2fd72967c560 100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -286,6 +286,8 @@ int __init parse_arch_dom0_param(const char *s, const char *e)
>          opt_dom0_cpuid_faulting = val;
>      else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 )
>          opt_dom0_msr_relaxed = val;
> +    else if ( (val = parse_boolean("crashconsole", s, e)) >= 0 )
> +        opt_hwdom_crashconsole = !!val;
>      else
>          return -EINVAL;
>  
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index aab546c0a8535e4f007cbbc9c5c552bcf66b5807..4fe69f294158dda7b2e0b9d98d49c34e04131cb8 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -56,6 +56,13 @@ unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
>  bool opt_dom0_vcpus_pin;
>  boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
>  
> +/*
> + * Hardware domain crash handler: if true, do not halt machine, but switch to
> + * Xen console for debugging.
> + */
> +bool opt_hwdom_crashconsole;

__ro_after_init.

> +boolean_param("hwdom_crashconsole", opt_hwdom_crashconsole);

This option doesn't seem to be documented at all in
xen-command-line.pandoc?

> +
>  /* Protect updates/reads (resp.) of domain_list and domain_hash. */
>  DEFINE_SPINLOCK(domlist_update_lock);
>  DEFINE_RCU_READ_LOCK(domlist_read_lock);
> @@ -1138,7 +1145,12 @@ int domain_shutdown(struct domain *d, u8 reason)
>      reason = d->shutdown_code;
>  
>      if ( is_hardware_domain(d) )
> -        hwdom_shutdown(reason);
> +    {
> +        if ( opt_hwdom_crashconsole )
> +            console_set_owner(DOMID_XEN);

Don't you need to pause all domain vCPUs and return early here to
avoid executing the rest of the function, that will likely destroy the
domain?

Maybe there's something I'm missing that prevents the hardware domain
destruction.

Thanks, Roger.
diff mbox series

Patch

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 293dbc1a957ba6e668fd4d55d58e84f643822126..fb77d7dca1ea517f79d6713aa6909422f31e7724 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -806,6 +806,7 @@  Specify the bit width of the DMA heap.
 
 ### dom0
     = List of [ pv | pvh, shadow=<bool>, verbose=<bool>,
+                crashconsole=<bool>,
                 cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86)
 
     = List of [ sve=<integer> ] (Arm64)
@@ -839,6 +840,10 @@  Controls for how dom0 is constructed on x86 systems.
     information during the dom0 build.  It defaults to the compile time choice
     of `CONFIG_VERBOSE_DEBUG`.
 
+*   The `crashconsole` boolean instructs Xen to drop into emergency console
+    in case of dom0 crash. May be useful for dom0 bringup on a custom
+    hardware.
+
 *   The `cpuid-faulting` boolean is an interim option, is only applicable to
     PV dom0, and defaults to true.
 
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index e8f5bf5447bc47a6daa3d95787106a4c11e80d31..706aeec0ecbb565a415edbfb33ca2fd72967c560 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -286,6 +286,8 @@  int __init parse_arch_dom0_param(const char *s, const char *e)
         opt_dom0_cpuid_faulting = val;
     else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 )
         opt_dom0_msr_relaxed = val;
+    else if ( (val = parse_boolean("crashconsole", s, e)) >= 0 )
+        opt_hwdom_crashconsole = !!val;
     else
         return -EINVAL;
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index aab546c0a8535e4f007cbbc9c5c552bcf66b5807..4fe69f294158dda7b2e0b9d98d49c34e04131cb8 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -56,6 +56,13 @@  unsigned int xen_processor_pmbits = XEN_PROCESSOR_PM_PX;
 bool opt_dom0_vcpus_pin;
 boolean_param("dom0_vcpus_pin", opt_dom0_vcpus_pin);
 
+/*
+ * Hardware domain crash handler: if true, do not halt machine, but switch to
+ * Xen console for debugging.
+ */
+bool opt_hwdom_crashconsole;
+boolean_param("hwdom_crashconsole", opt_hwdom_crashconsole);
+
 /* Protect updates/reads (resp.) of domain_list and domain_hash. */
 DEFINE_SPINLOCK(domlist_update_lock);
 DEFINE_RCU_READ_LOCK(domlist_read_lock);
@@ -1138,7 +1145,12 @@  int domain_shutdown(struct domain *d, u8 reason)
     reason = d->shutdown_code;
 
     if ( is_hardware_domain(d) )
-        hwdom_shutdown(reason);
+    {
+        if ( opt_hwdom_crashconsole )
+            console_set_owner(DOMID_XEN);
+        else
+            hwdom_shutdown(reason);
+    }
 
     if ( d->is_shutting_down )
     {
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index 6102826a929ff7aad58a4bc40974815071a97446..e0c52af878b69e28e2d19957d0d3a8234860ecba 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -150,6 +150,7 @@  extern unsigned int xen_processor_pmbits;
 extern bool opt_dom0_vcpus_pin;
 extern cpumask_t dom0_cpus;
 extern bool dom0_affinity_relaxed;
+extern bool opt_hwdom_crashconsole;
 
 /* vnuma topology per domain. */
 struct vnuma_info {