diff mbox series

[4.15] x86/HPET: don't enable legacy replacement mode unconditionally

Message ID 8e18a2a5-bc19-615d-0c8c-cea49adcf976@suse.com (mailing list archive)
State New, archived
Headers show
Series [4.15] x86/HPET: don't enable legacy replacement mode unconditionally | expand

Commit Message

Jan Beulich March 24, 2021, 10:34 a.m. UTC
Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
static PIT clock gating") was reported to cause boot failures on certain
AMD Ryzen systems. Until we can figure out what the actual issue there
is, skip this new part of HPET setup by default. Introduce a "hpet"
command line option to allow enabling this on hardware where it's really
needed for Xen to boot successfully (i.e. where the PIT doesn't drive
the timer interrupt).

Since it makes little sense to introduce just "hpet=legacy-replacement",
also allow for a boolean argument as well as "broadcast" to replace the
separate "hpetbroadcast" option.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Ian Jackson March 24, 2021, 10:52 a.m. UTC | #1
Jan Beulich writes ("[PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally"):
> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
> static PIT clock gating") was reported to cause boot failures on certain
> AMD Ryzen systems. Until we can figure out what the actual issue there
> is, skip this new part of HPET setup by default. Introduce a "hpet"
> command line option to allow enabling this on hardware where it's really
> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
> the timer interrupt).
> 
> Since it makes little sense to introduce just "hpet=legacy-replacement",
> also allow for a boolean argument as well as "broadcast" to replace the
> separate "hpetbroadcast" option.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>

Thanks.  I would like to ee it committed by the end of the week.

I don't feel qualified to review this.  I'm slightly uncomfortable
with the command line rework but I think as you say there probably
isn't a good more-minimal version.

I would like to ask the reviewer(s) to focus on this question: if the
option is not passed, does this revert the behaviour to be identical
to the behaviour of Xen before e1de4c196a2e ?

Thanks,
Ian.
Tamas K Lengyel March 24, 2021, 11:37 a.m. UTC | #2
On Wed, Mar 24, 2021 at 6:34 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
> static PIT clock gating") was reported to cause boot failures on certain
> AMD Ryzen systems. Until we can figure out what the actual issue there
> is, skip this new part of HPET setup by default. Introduce a "hpet"
> command line option to allow enabling this on hardware where it's really
> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
> the timer interrupt).
>
> Since it makes little sense to introduce just "hpet=legacy-replacement",
> also allow for a boolean argument as well as "broadcast" to replace the
> separate "hpetbroadcast" option.

While having the command line option to control it is fine what would
really be the best solution is if Xen could figure out when the
legacy-replacement option is necessary to begin with and enable it on
its own, even if it's done as a fallback route. We'll have issues with
telling users when the option is needed and when it isn't. I don't
like the idea of users having to go through a route of "well, let's
see if Xen boots and if you get this weird crash/reboot add this
obscure boot option". It's just a bad user experience all around.

Tamas
Jan Beulich March 24, 2021, 1:40 p.m. UTC | #3
On 24.03.2021 12:37, Tamas K Lengyel wrote:
> On Wed, Mar 24, 2021 at 6:34 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
>> static PIT clock gating") was reported to cause boot failures on certain
>> AMD Ryzen systems. Until we can figure out what the actual issue there
>> is, skip this new part of HPET setup by default. Introduce a "hpet"
>> command line option to allow enabling this on hardware where it's really
>> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
>> the timer interrupt).
>>
>> Since it makes little sense to introduce just "hpet=legacy-replacement",
>> also allow for a boolean argument as well as "broadcast" to replace the
>> separate "hpetbroadcast" option.
> 
> While having the command line option to control it is fine what would
> really be the best solution is if Xen could figure out when the
> legacy-replacement option is necessary to begin with and enable it on
> its own, even if it's done as a fallback route.

This was the original plan, but no patch has arrived by now. I can
imagine this being due to things being easier to state than to
actually carry out. Plus of course this fallback approach still
isn't ideal - even better would be if we could address the actual
failure. I for one lack sufficient technical data to at least try
to think of possible solutions.

> We'll have issues with
> telling users when the option is needed and when it isn't. I don't
> like the idea of users having to go through a route of "well, let's
> see if Xen boots and if you get this weird crash/reboot add this
> obscure boot option". It's just a bad user experience all around.

I can't see how it's worse than what we've had so far, crashing
during boot _without_ there being any option available.

Jan
Ian Jackson March 26, 2021, 4:43 p.m. UTC | #4
Jan Beulich writes ("[PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally"):
> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
> static PIT clock gating") was reported to cause boot failures on certain
> AMD Ryzen systems. Until we can figure out what the actual issue there
> is, skip this new part of HPET setup by default. Introduce a "hpet"
> command line option to allow enabling this on hardware where it's really
> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
> the timer interrupt).
> 
> Since it makes little sense to introduce just "hpet=legacy-replacement",
> also allow for a boolean argument as well as "broadcast" to replace the
> separate "hpetbroadcast" option.

Reviewed-by: Ian Jackson <iwj@xenproject.org>

I have to say that this

   -    if ( hpet_rate )
   +    if ( hpet_rate || !hpet_address || !opt_hpet )
            return hpet_rate;

   -    if ( hpet_address == 0 )
   -        return 0;
   -

is to my mind quite an obscure coding style.

Do we really want to return a nozero hpet_rate even if !opt_hpet ?

I would have said

   +
   +    if ( hpet_address == 0 || !opt_hpet )
   +        return 0;

        if ( hpet_rate )
        if ( hpet_rate )
            return hpet_rate;

   -    if ( hpet_address == 0 )
   -        return 0;
   -

But Andy's version of expresses it the same way so fine, if that's the
way you like to do things, and hpet_opt is new in this patch so I
don't consider it a crisis if it doesn't work right.

Ian.
Jan Beulich March 26, 2021, 4:52 p.m. UTC | #5
On 26.03.2021 17:43, Ian Jackson wrote:
> Jan Beulich writes ("[PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally"):
>> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
>> static PIT clock gating") was reported to cause boot failures on certain
>> AMD Ryzen systems. Until we can figure out what the actual issue there
>> is, skip this new part of HPET setup by default. Introduce a "hpet"
>> command line option to allow enabling this on hardware where it's really
>> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
>> the timer interrupt).
>>
>> Since it makes little sense to introduce just "hpet=legacy-replacement",
>> also allow for a boolean argument as well as "broadcast" to replace the
>> separate "hpetbroadcast" option.
> 
> Reviewed-by: Ian Jackson <iwj@xenproject.org>

Thanks, but with Andrew's pending objection I don't feel like
committing it.

> I have to say that this
> 
>    -    if ( hpet_rate )
>    +    if ( hpet_rate || !hpet_address || !opt_hpet )
>             return hpet_rate;
> 
>    -    if ( hpet_address == 0 )
>    -        return 0;
>    -
> 
> is to my mind quite an obscure coding style.
> 
> Do we really want to return a nozero hpet_rate even if !opt_hpet ?

We won't: hpet_rate will be set to non-zero only further down in
the function.

Jan
Ian Jackson March 26, 2021, 4:54 p.m. UTC | #6
Jan Beulich writes ("Re: [PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally"):
> Thanks, but with Andrew's pending objection I don't feel like
> committing it.

I understand.

> > I have to say that this
> > 
> >    -    if ( hpet_rate )
> >    +    if ( hpet_rate || !hpet_address || !opt_hpet )
> >             return hpet_rate;
> > 
> >    -    if ( hpet_address == 0 )
> >    -        return 0;
> >    -
> > 
> > is to my mind quite an obscure coding style.
> > 
> > Do we really want to return a nozero hpet_rate even if !opt_hpet ?
> 
> We won't: hpet_rate will be set to non-zero only further down in
> the function.

Oh, I see.  Right.  Thanks for the quick reply.

Ian.
Roger Pau Monne March 26, 2021, 5 p.m. UTC | #7
On Wed, Mar 24, 2021 at 11:34:32AM +0100, Jan Beulich wrote:
> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
> static PIT clock gating") was reported to cause boot failures on certain
> AMD Ryzen systems. Until we can figure out what the actual issue there
> is, skip this new part of HPET setup by default. Introduce a "hpet"
> command line option to allow enabling this on hardware where it's really
> needed for Xen to boot successfully (i.e. where the PIT doesn't drive
> the timer interrupt).
> 
> Since it makes little sense to introduce just "hpet=legacy-replacement",
> also allow for a boolean argument as well as "broadcast" to replace the
> separate "hpetbroadcast" option.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I think the commit does what it saying on the commit message, hence:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I would like to avoid such RB being seen as me deciding on which
option is best release wise.

Haven't followed the other discussion closely as I'm on PTO today, but
maybe it's an issue worth thinking over during the weekend?

> 
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1274,9 +1274,26 @@ supported. See docs/misc/arm/big.LITTLE.
>  When the hmp-unsafe option is disabled (default), CPUs that are not
>  identical to the boot CPU will be parked and not used by Xen.
>  
> +### hpet (x86)
> +> `= List of [ <boolean> | broadcast | legacy-replacement ]`
> +
> +> Default : `true`, `no-broadcast`, 'no-legacy-replacement`
> +
> +Controls Xen's use of the system's High Precision Event Timer.  The boolean
> +allows to turn off use altogether.
> +
> +`broadcast` forces Xen to keep using the broadcast for CPUs in deep C-states
> +even when an RTC interrupt got enabled.
> +
> +`legacy-replacement` is intended to be used on platforms where the timer
> +interrupt doesn't get raised by the legacy PIT.  This then also affects
> +raising of the RTC interrupt.

I think Andrew rework of the change moved the x86 tag to a field on
the description instead of being in the title of the option and
arranged the options to be in list format, we might want to use that
instead, but can be adjusted later I guess since that would be a
documentation change.

Thanks, Roger.
diff mbox series

Patch

--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1274,9 +1274,26 @@  supported. See docs/misc/arm/big.LITTLE.
 When the hmp-unsafe option is disabled (default), CPUs that are not
 identical to the boot CPU will be parked and not used by Xen.
 
+### hpet (x86)
+> `= List of [ <boolean> | broadcast | legacy-replacement ]`
+
+> Default : `true`, `no-broadcast`, 'no-legacy-replacement`
+
+Controls Xen's use of the system's High Precision Event Timer.  The boolean
+allows to turn off use altogether.
+
+`broadcast` forces Xen to keep using the broadcast for CPUs in deep C-states
+even when an RTC interrupt got enabled.
+
+`legacy-replacement` is intended to be used on platforms where the timer
+interrupt doesn't get raised by the legacy PIT.  This then also affects
+raising of the RTC interrupt.
+
 ### hpetbroadcast (x86)
 > `= <boolean>`
 
+Deprecated alternative of `hpet=broadcast`.
+
 ### hvm_debug (x86)
 > `= <integer>`
 
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -52,6 +52,8 @@  static unsigned int __read_mostly num_hp
 DEFINE_PER_CPU(struct hpet_event_channel *, cpu_bc_channel);
 
 unsigned long __initdata hpet_address;
+static bool __initdata opt_hpet = true;
+static bool __initdata opt_legacy_replacement;
 u8 __initdata hpet_blockid;
 u8 __initdata hpet_flags;
 
@@ -63,6 +65,32 @@  u8 __initdata hpet_flags;
 static bool __initdata force_hpet_broadcast;
 boolean_param("hpetbroadcast", force_hpet_broadcast);
 
+static int __init parse_hpet_param(const char *s)
+{
+    const char *ss;
+    int val, rc = 0;
+
+    do {
+        ss = strchr(s, ',');
+        if ( !ss )
+            ss = strchr(s, '\0');
+
+        if ( (val = parse_bool(s, ss)) >= 0 )
+            opt_hpet = val;
+        else if ( (val = parse_boolean("broadcast", s, ss)) >= 0 )
+            force_hpet_broadcast = val;
+        else if ( (val = parse_boolean("legacy-replacement", s, ss)) >= 0 )
+            opt_legacy_replacement = val;
+        else
+            rc = -EINVAL;
+
+        s = ss + 1;
+    } while ( *ss );
+
+    return rc;
+}
+custom_param("hpet", parse_hpet_param);
+
 /*
  * Calculate a multiplication factor for scaled math, which is used to convert
  * nanoseconds based values to clock ticks:
@@ -761,12 +789,9 @@  u64 __init hpet_setup(void)
     unsigned int hpet_id, hpet_period, hpet_cfg;
     unsigned int last, rem;
 
-    if ( hpet_rate )
+    if ( hpet_rate || !hpet_address || !opt_hpet )
         return hpet_rate;
 
-    if ( hpet_address == 0 )
-        return 0;
-
     set_fixmap_nocache(FIX_HPET_BASE, hpet_address);
 
     hpet_id = hpet_read32(HPET_ID);
@@ -803,9 +828,9 @@  u64 __init hpet_setup(void)
      * IRQ routing is configured.
      *
      * Reconfigure the HPET into legacy mode to re-establish the timer
-     * interrupt.
+     * interrupt, if available and if so requested.
      */
-    if ( hpet_id & HPET_ID_LEGSUP &&
+    if ( opt_legacy_replacement && (hpet_id & HPET_ID_LEGSUP) &&
          !((hpet_cfg = hpet_read32(HPET_CFG)) & HPET_CFG_LEGACY) )
     {
         unsigned int c0_cfg, ticks, count;