diff mbox series

[v7,1/2] x86/time: introduce command line option to select wallclock

Message ID 20240913075907.34460-2-roger.pau@citrix.com (mailing list archive)
State New
Headers show
Series x86/time: improvements to wallclock logic | expand

Commit Message

Roger Pau Monné Sept. 13, 2024, 7:59 a.m. UTC
Allow setting the used wallclock from the command line.  When the option is set
to a value different than `auto` the probing is bypassed and the selected
implementation is used (as long as it's available).

The `xen` and `efi` options require being booted as a Xen guest (with Xen guest
supported built-in) or from UEFI firmware respectively.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v6:
 - Clarify documentation regarding repeated using of the wallclock command line
   option.

Changes since v5:
 - Do EFI run-time services checking after command line parsing.

Changes since v3:
 - Note xen option is only available if Xen guest support it built.
 - Fix typo.
---
 docs/misc/xen-command-line.pandoc | 21 ++++++++++++++++
 xen/arch/x86/time.c               | 41 ++++++++++++++++++++++++++++++-
 2 files changed, 61 insertions(+), 1 deletion(-)

Comments

Jan Beulich Sept. 13, 2024, 12:38 p.m. UTC | #1
On 13.09.2024 09:59, Roger Pau Monne wrote:
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
>      return "";
>  }
>  
> +static int __init cf_check parse_wallclock(const char *arg)
> +{
> +    wallclock_source = WALLCLOCK_UNSET;

With this ...

> +    if ( !arg )
> +        return -EINVAL;
> +
> +    if ( !strcmp("auto", arg) )
> +        ASSERT(wallclock_source == WALLCLOCK_UNSET);

... I'm not convinced this is (still) needed.

Jan
Roger Pau Monné Sept. 16, 2024, 7:50 a.m. UTC | #2
On Fri, Sep 13, 2024 at 02:38:14PM +0200, Jan Beulich wrote:
> On 13.09.2024 09:59, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
> >      return "";
> >  }
> >  
> > +static int __init cf_check parse_wallclock(const char *arg)
> > +{
> > +    wallclock_source = WALLCLOCK_UNSET;
> 
> With this ...
> 
> > +    if ( !arg )
> > +        return -EINVAL;
> > +
> > +    if ( !strcmp("auto", arg) )
> > +        ASSERT(wallclock_source == WALLCLOCK_UNSET);
> 
> ... I'm not convinced this is (still) needed.

It reduces to an empty statement in release builds, and is IMO clearer
code wise.  I could replace the assert with a comment, but IMO the
assert conveys the same information in a more compact format and
provides extra safety in case the code is changed and wallclock_source
is no longer initialized to the expected value.

Thanks, Roger.
Jan Beulich Sept. 16, 2024, 7:53 a.m. UTC | #3
On 16.09.2024 09:50, Roger Pau Monné wrote:
> On Fri, Sep 13, 2024 at 02:38:14PM +0200, Jan Beulich wrote:
>> On 13.09.2024 09:59, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/time.c
>>> +++ b/xen/arch/x86/time.c
>>> @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
>>>      return "";
>>>  }
>>>  
>>> +static int __init cf_check parse_wallclock(const char *arg)
>>> +{
>>> +    wallclock_source = WALLCLOCK_UNSET;
>>
>> With this ...
>>
>>> +    if ( !arg )
>>> +        return -EINVAL;
>>> +
>>> +    if ( !strcmp("auto", arg) )
>>> +        ASSERT(wallclock_source == WALLCLOCK_UNSET);
>>
>> ... I'm not convinced this is (still) needed.
> 
> It reduces to an empty statement in release builds, and is IMO clearer
> code wise.  I could replace the assert with a comment, but IMO the
> assert conveys the same information in a more compact format and
> provides extra safety in case the code is changed and wallclock_source
> is no longer initialized to the expected value.

Well, so be it then.

Jan
Andrew Cooper Sept. 16, 2024, 1:11 p.m. UTC | #4
On 13/09/2024 8:59 am, Roger Pau Monne wrote:
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 959cf45b55d9..2a9b3b9b8975 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2816,6 +2816,27 @@ vwfi to `native` reduces irq latency significantly. It can also lead to
>  suboptimal scheduling decisions, but only when the system is
>  oversubscribed (i.e., in total there are more vCPUs than pCPUs).
>  
> +### wallclock (x86)
> +> `= auto | xen | cmos | efi`
> +
> +> Default: `auto`
> +
> +Allow forcing the usage of a specific wallclock source.
> +
> + * `auto` let the hypervisor select the clocksource based on internal
> +   heuristics.
> +
> + * `xen` force usage of the Xen shared_info wallclock when booted as a Xen
> +   guest.  This option is only available if the hypervisor was compiled with
> +   `CONFIG_XEN_GUEST` enabled.
> +
> + * `cmos` force usage of the CMOS RTC wallclock.
> +
> + * `efi` force usage of the EFI_GET_TIME run-time method when booted from EFI
> +   firmware.

For both `xen` and `efi`, something should be said about "if selected
and not satisfied, Xen falls back to other heuristics".

> +
> +If the selected option is invalid or not available Xen will default to `auto`.

I'm afraid that I'm firmly of the opinion that "auto" on the cmdline is
unnecessary complexity.  Auto is the default, and doesn't need
specifying explicitly.  That in turn simplifies ...

> +
>  ### watchdog (x86)
>  > `= force | <boolean>`
>  
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index 29b026735e5d..e4751684951e 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
>      return "";
>  }
>  
> +static int __init cf_check parse_wallclock(const char *arg)
> +{
> +    wallclock_source = WALLCLOCK_UNSET;
> +
> +    if ( !arg )
> +        return -EINVAL;
> +
> +    if ( !strcmp("auto", arg) )
> +        ASSERT(wallclock_source == WALLCLOCK_UNSET);

... this.

Hitting this assert will manifest as a system reboot/hang with no
information on serial/VGA, because all of this runs prior to getting up
the console.  You only get to see it on a PVH boot because we bodge the
Xen console into default-existence.

So, ASSERT()/etc really need avoiding wherever possible in cmdline parsing.

In this case, all it serves to do is break examples like `wallclock=xen
wallclock=auto` case, which is unlike our latest-takes-precedence
behaviour everywhere else.

> +    else if ( !strcmp("xen", arg) )
> +    {
> +        if ( !xen_guest )

We don't normally treat this path as an error when parsing (we know what
it is, even if we can't action it).  Instead, there's no_config_param()
to be more friendly (for PVH at least).

It's a bit awkward, but this should do:

    {
#ifdef CONFIG_XEN_GUEST
        wallclock_source = WALLCLOCK_XEN;
#else
        no_config_param("XEN_GUEST", "wallclock", s, ss);
#endif
    }

There probably wants to be something similar for EFI, although it's not
a plain CONFIG so it might be more tricky.

~Andrew
Marek Marczykowski-Górecki Sept. 16, 2024, 1:20 p.m. UTC | #5
On Mon, Sep 16, 2024 at 02:11:08PM +0100, Andrew Cooper wrote:
> On 13/09/2024 8:59 am, Roger Pau Monne wrote:
> > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> > index 959cf45b55d9..2a9b3b9b8975 100644
> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -2816,6 +2816,27 @@ vwfi to `native` reduces irq latency significantly. It can also lead to
> >  suboptimal scheduling decisions, but only when the system is
> >  oversubscribed (i.e., in total there are more vCPUs than pCPUs).
> >  
> > +### wallclock (x86)
> > +> `= auto | xen | cmos | efi`
> > +
> > +> Default: `auto`
> > +
> > +Allow forcing the usage of a specific wallclock source.
> > +
> > + * `auto` let the hypervisor select the clocksource based on internal
> > +   heuristics.
> > +
> > + * `xen` force usage of the Xen shared_info wallclock when booted as a Xen
> > +   guest.  This option is only available if the hypervisor was compiled with
> > +   `CONFIG_XEN_GUEST` enabled.
> > +
> > + * `cmos` force usage of the CMOS RTC wallclock.
> > +
> > + * `efi` force usage of the EFI_GET_TIME run-time method when booted from EFI
> > +   firmware.
> 
> For both `xen` and `efi`, something should be said about "if selected
> and not satisfied, Xen falls back to other heuristics".
> 
> > +
> > +If the selected option is invalid or not available Xen will default to `auto`.
> 
> I'm afraid that I'm firmly of the opinion that "auto" on the cmdline is
> unnecessary complexity.  Auto is the default, and doesn't need
> specifying explicitly.  That in turn simplifies ...

What about overriding earlier choice? For example overriding a built-in
cmdline? That said, with the current code, the same can be achieved with
wallclock=foo, and living with the warning in boot log...

> > +
> >  ### watchdog (x86)
> >  > `= force | <boolean>`
> >  
> > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> > index 29b026735e5d..e4751684951e 100644
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
> >      return "";
> >  }
> >  
> > +static int __init cf_check parse_wallclock(const char *arg)
> > +{
> > +    wallclock_source = WALLCLOCK_UNSET;
> > +
> > +    if ( !arg )
> > +        return -EINVAL;
> > +
> > +    if ( !strcmp("auto", arg) )
> > +        ASSERT(wallclock_source == WALLCLOCK_UNSET);
> 
> ... this.
> 
> Hitting this assert will manifest as a system reboot/hang with no
> information on serial/VGA, because all of this runs prior to getting up
> the console.  You only get to see it on a PVH boot because we bodge the
> Xen console into default-existence.

This assert is no-op as wallclock_source is unconditionally set to WALLCLOCK_UNSET few lines above.

> So, ASSERT()/etc really need avoiding wherever possible in cmdline parsing.
> 
> In this case, all it serves to do is break examples like `wallclock=xen
> wallclock=auto` case, which is unlike our latest-takes-precedence
> behaviour everywhere else.
> 
> > +    else if ( !strcmp("xen", arg) )
> > +    {
> > +        if ( !xen_guest )
> 
> We don't normally treat this path as an error when parsing (we know what
> it is, even if we can't action it).  Instead, there's no_config_param()
> to be more friendly (for PVH at least).
> 
> It's a bit awkward, but this should do:
> 
>     {
> #ifdef CONFIG_XEN_GUEST
>         wallclock_source = WALLCLOCK_XEN;
> #else
>         no_config_param("XEN_GUEST", "wallclock", s, ss);
> #endif
>     }

Can you boot the binary build with CONFIG_XEN_GUEST=y as native? If so,
the above will not be enough, a runtime check is needed anyway.

> There probably wants to be something similar for EFI, although it's not
> a plain CONFIG so it might be more tricky.

It needs to be runtime check here even more. Not only because of
different boot modes, but due to interaction with efi=no-rs (or any
other reason for not having runtime services). See the comment there.
Roger Pau Monné Sept. 16, 2024, 2:14 p.m. UTC | #6
On Mon, Sep 16, 2024 at 03:20:56PM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Sep 16, 2024 at 02:11:08PM +0100, Andrew Cooper wrote:
> > On 13/09/2024 8:59 am, Roger Pau Monne wrote:
> > > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> > > index 959cf45b55d9..2a9b3b9b8975 100644
> > > --- a/docs/misc/xen-command-line.pandoc
> > > +++ b/docs/misc/xen-command-line.pandoc
> > > @@ -2816,6 +2816,27 @@ vwfi to `native` reduces irq latency significantly. It can also lead to
> > >  suboptimal scheduling decisions, but only when the system is
> > >  oversubscribed (i.e., in total there are more vCPUs than pCPUs).
> > >  
> > > +### wallclock (x86)
> > > +> `= auto | xen | cmos | efi`
> > > +
> > > +> Default: `auto`
> > > +
> > > +Allow forcing the usage of a specific wallclock source.
> > > +
> > > + * `auto` let the hypervisor select the clocksource based on internal
> > > +   heuristics.
> > > +
> > > + * `xen` force usage of the Xen shared_info wallclock when booted as a Xen
> > > +   guest.  This option is only available if the hypervisor was compiled with
> > > +   `CONFIG_XEN_GUEST` enabled.
> > > +
> > > + * `cmos` force usage of the CMOS RTC wallclock.
> > > +
> > > + * `efi` force usage of the EFI_GET_TIME run-time method when booted from EFI
> > > +   firmware.
> > 
> > For both `xen` and `efi`, something should be said about "if selected
> > and not satisfied, Xen falls back to other heuristics".

There's a line just below that notes this: "If the selected option is
invalid or not available Xen will default to `auto`."  I think it's
clearer than having to comment on every specific option.

> > > +
> > > +If the selected option is invalid or not available Xen will default to `auto`.
> > 
> > I'm afraid that I'm firmly of the opinion that "auto" on the cmdline is
> > unnecessary complexity.  Auto is the default, and doesn't need
> > specifying explicitly.  That in turn simplifies ...
> 
> What about overriding earlier choice? For example overriding a built-in
> cmdline? That said, with the current code, the same can be achieved with
> wallclock=foo, and living with the warning in boot log...

It's IMO weird to ask the users to use wallclock=foo to override a
previous command line wallclock option and fallback to the default
behavior.

> > > +
> > >  ### watchdog (x86)
> > >  > `= force | <boolean>`
> > >  
> > > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> > > index 29b026735e5d..e4751684951e 100644
> > > --- a/xen/arch/x86/time.c
> > > +++ b/xen/arch/x86/time.c
> > > @@ -1552,6 +1552,37 @@ static const char *__init wallclock_type_to_string(void)
> > >      return "";
> > >  }
> > >  
> > > +static int __init cf_check parse_wallclock(const char *arg)
> > > +{
> > > +    wallclock_source = WALLCLOCK_UNSET;
> > > +
> > > +    if ( !arg )
> > > +        return -EINVAL;
> > > +
> > > +    if ( !strcmp("auto", arg) )
> > > +        ASSERT(wallclock_source == WALLCLOCK_UNSET);
> > 
> > ... this.
> > 
> > Hitting this assert will manifest as a system reboot/hang with no
> > information on serial/VGA, because all of this runs prior to getting up
> > the console.  You only get to see it on a PVH boot because we bodge the
> > Xen console into default-existence.
> 
> This assert is no-op as wallclock_source is unconditionally set to WALLCLOCK_UNSET few lines above.

As mentioned to Jan - I find it nicer than adding an empty statement.

> > So, ASSERT()/etc really need avoiding wherever possible in cmdline parsing.
> > 
> > In this case, all it serves to do is break examples like `wallclock=xen
> > wallclock=auto` case, which is unlike our latest-takes-precedence
> > behaviour everywhere else.
> > 
> > > +    else if ( !strcmp("xen", arg) )
> > > +    {
> > > +        if ( !xen_guest )
> > 
> > We don't normally treat this path as an error when parsing (we know what
> > it is, even if we can't action it).  Instead, there's no_config_param()
> > to be more friendly (for PVH at least).
> > 
> > It's a bit awkward, but this should do:
> > 
> >     {
> > #ifdef CONFIG_XEN_GUEST
> >         wallclock_source = WALLCLOCK_XEN;
> > #else
> >         no_config_param("XEN_GUEST", "wallclock", s, ss);
> > #endif
> >     }
> 
> Can you boot the binary build with CONFIG_XEN_GUEST=y as native? If so,
> the above will not be enough, a runtime check is needed anyway.
> 
> > There probably wants to be something similar for EFI, although it's not
> > a plain CONFIG so it might be more tricky.
> 
> It needs to be runtime check here even more. Not only because of
> different boot modes, but due to interaction with efi=no-rs (or any
> other reason for not having runtime services). See the comment there.

I agree with Marek, no_config_param() is not enough here: Xen needs to
ensure it has been booted as a Xen guest, or that EFI run-time
services are enabled in order to ensure the selected clock source is
available.

Thanks, Roger.
diff mbox series

Patch

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 959cf45b55d9..2a9b3b9b8975 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2816,6 +2816,27 @@  vwfi to `native` reduces irq latency significantly. It can also lead to
 suboptimal scheduling decisions, but only when the system is
 oversubscribed (i.e., in total there are more vCPUs than pCPUs).
 
+### wallclock (x86)
+> `= auto | xen | cmos | efi`
+
+> Default: `auto`
+
+Allow forcing the usage of a specific wallclock source.
+
+ * `auto` let the hypervisor select the clocksource based on internal
+   heuristics.
+
+ * `xen` force usage of the Xen shared_info wallclock when booted as a Xen
+   guest.  This option is only available if the hypervisor was compiled with
+   `CONFIG_XEN_GUEST` enabled.
+
+ * `cmos` force usage of the CMOS RTC wallclock.
+
+ * `efi` force usage of the EFI_GET_TIME run-time method when booted from EFI
+   firmware.
+
+If the selected option is invalid or not available Xen will default to `auto`.
+
 ### watchdog (x86)
 > `= force | <boolean>`
 
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 29b026735e5d..e4751684951e 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1552,6 +1552,37 @@  static const char *__init wallclock_type_to_string(void)
     return "";
 }
 
+static int __init cf_check parse_wallclock(const char *arg)
+{
+    wallclock_source = WALLCLOCK_UNSET;
+
+    if ( !arg )
+        return -EINVAL;
+
+    if ( !strcmp("auto", arg) )
+        ASSERT(wallclock_source == WALLCLOCK_UNSET);
+    else if ( !strcmp("xen", arg) )
+    {
+        if ( !xen_guest )
+            return -EINVAL;
+
+        wallclock_source = WALLCLOCK_XEN;
+    }
+    else if ( !strcmp("cmos", arg) )
+        wallclock_source = WALLCLOCK_CMOS;
+    else if ( !strcmp("efi", arg) )
+        /*
+         * Checking if run-time services are available must be done after
+         * command line parsing.
+         */
+        wallclock_source = WALLCLOCK_EFI;
+    else
+        return -EINVAL;
+
+    return 0;
+}
+custom_param("wallclock", parse_wallclock);
+
 static void __init probe_wallclock(void)
 {
     ASSERT(wallclock_source == WALLCLOCK_UNSET);
@@ -2527,7 +2558,15 @@  int __init init_xen_time(void)
 
     open_softirq(TIME_CALIBRATE_SOFTIRQ, local_time_calibration);
 
-    probe_wallclock();
+    /*
+     * EFI run time services can be disabled from the command line, hence the
+     * check for them cannot be done as part of the wallclock option parsing.
+     */
+    if ( wallclock_source == WALLCLOCK_EFI && !efi_enabled(EFI_RS) )
+        wallclock_source = WALLCLOCK_UNSET;
+
+    if ( wallclock_source == WALLCLOCK_UNSET )
+        probe_wallclock();
 
     printk(XENLOG_INFO "Wallclock source: %s\n", wallclock_type_to_string());