diff mbox series

[v4] x86/mwait-idle: re-order state entry/exit code a little

Message ID 2db7bc79-4fe4-abf6-9e5a-83055af9a78a@suse.com (mailing list archive)
State New, archived
Headers show
Series [v4] x86/mwait-idle: re-order state entry/exit code a little | expand

Commit Message

Jan Beulich Feb. 18, 2022, 8:35 a.m. UTC
The initial observation is that unlike the original ACPI idle driver we
have a 2nd cpu_is_haltable() in here. By making the actual state entry
conditional, the emitted trace records as well as the subsequent stats
update are at least misleading in case the state wasn't actually entered.
Hence they would want moving inside the conditional. At which point the
cpuidle_get_tick() invocations could (and hence should) move as well.
cstate_restore_tsc() also isn't needed if we didn't actually enter the
state.

This leaves only the errata_c6_workaround() and lapic_timer_off()
invocations outside the conditional. As a result it looks easier to
drop the conditional (and come back in sync with the other driver again)
than to move almost everything into the conditional.

While there also move the TRACE_6D() out of the IRQ-disabled region.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Moving the TRACE_6D() may be a little controversial, as this could lead
to a sequence of trace records not actually representing the sequence of
events, in case further records get emitted by interrupt handlers. But
with us now conditionally enabling interrupts around MWAIT, that issue
exists already anyway.

Unlike said in the earlier outline of the alternative approach,
errata_c6_workaround() cannot be moved: cpu_has_pending_apic_eoi() needs
to be called when IRQs are already off.
---
v4: Different approach (and title), as was previously outlined as an
    alternative.
v3: Also move cstate_restore_tsc() invocation and split ones to
    update_idle_stats().
v2: New.

Comments

Roger Pau Monné Feb. 22, 2022, 9:08 a.m. UTC | #1
On Fri, Feb 18, 2022 at 09:35:10AM +0100, Jan Beulich wrote:
> The initial observation is that unlike the original ACPI idle driver we
> have a 2nd cpu_is_haltable() in here. By making the actual state entry
> conditional, the emitted trace records as well as the subsequent stats
> update are at least misleading in case the state wasn't actually entered.
> Hence they would want moving inside the conditional. At which point the
> cpuidle_get_tick() invocations could (and hence should) move as well.
> cstate_restore_tsc() also isn't needed if we didn't actually enter the
> state.
> 
> This leaves only the errata_c6_workaround() and lapic_timer_off()
> invocations outside the conditional. As a result it looks easier to
> drop the conditional (and come back in sync with the other driver again)
> than to move almost everything into the conditional.
> 
> While there also move the TRACE_6D() out of the IRQ-disabled region.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> Moving the TRACE_6D() may be a little controversial, as this could lead
> to a sequence of trace records not actually representing the sequence of
> events, in case further records get emitted by interrupt handlers. But
> with us now conditionally enabling interrupts around MWAIT, that issue
> exists already anyway.

I think that's OK. We have to priority interrupt latency over trace
readability.

> Unlike said in the earlier outline of the alternative approach,
> errata_c6_workaround() cannot be moved: cpu_has_pending_apic_eoi() needs
> to be called when IRQs are already off.
> ---
> v4: Different approach (and title), as was previously outlined as an
>     alternative.
> v3: Also move cstate_restore_tsc() invocation and split ones to
>     update_idle_stats().
> v2: New.
> 
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -847,26 +847,25 @@ static void mwait_idle(void)
>  
>  	update_last_cx_stat(power, cx, before);
>  
> -	if (cpu_is_haltable(cpu)) {
> -		if (cx->irq_enable_early)
> -			local_irq_enable();
> +	if (cx->irq_enable_early)
> +		local_irq_enable();

Now that I look at this again, we need to be careful with the enabling
interrupts and the interaction with errata_c6_workaround.  Enabling
interrupts here could change the result of the check for pending EOIs,
and thus enter mwait with a condition that could trigger the erratas.
Hopefully CPUIDLE_FLAG_IRQ_ENABLE is only set for C1.

It might be prudent to only allow setting CPUIDLE_FLAG_IRQ_ENABLE for
states <= 2.

Thanks, Roger.
Jan Beulich Feb. 22, 2022, 9:34 a.m. UTC | #2
On 22.02.2022 10:08, Roger Pau Monné wrote:
> On Fri, Feb 18, 2022 at 09:35:10AM +0100, Jan Beulich wrote:
>> The initial observation is that unlike the original ACPI idle driver we
>> have a 2nd cpu_is_haltable() in here. By making the actual state entry
>> conditional, the emitted trace records as well as the subsequent stats
>> update are at least misleading in case the state wasn't actually entered.
>> Hence they would want moving inside the conditional. At which point the
>> cpuidle_get_tick() invocations could (and hence should) move as well.
>> cstate_restore_tsc() also isn't needed if we didn't actually enter the
>> state.
>>
>> This leaves only the errata_c6_workaround() and lapic_timer_off()
>> invocations outside the conditional. As a result it looks easier to
>> drop the conditional (and come back in sync with the other driver again)
>> than to move almost everything into the conditional.
>>
>> While there also move the TRACE_6D() out of the IRQ-disabled region.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/cpu/mwait-idle.c
>> +++ b/xen/arch/x86/cpu/mwait-idle.c
>> @@ -847,26 +847,25 @@ static void mwait_idle(void)
>>  
>>  	update_last_cx_stat(power, cx, before);
>>  
>> -	if (cpu_is_haltable(cpu)) {
>> -		if (cx->irq_enable_early)
>> -			local_irq_enable();
>> +	if (cx->irq_enable_early)
>> +		local_irq_enable();
> 
> Now that I look at this again, we need to be careful with the enabling
> interrupts and the interaction with errata_c6_workaround.  Enabling
> interrupts here could change the result of the check for pending EOIs,
> and thus enter mwait with a condition that could trigger the erratas.
> Hopefully CPUIDLE_FLAG_IRQ_ENABLE is only set for C1.
> 
> It might be prudent to only allow setting CPUIDLE_FLAG_IRQ_ENABLE for
> states <= 2.

Well, the justification for enabling was the low exit time. I don't
expect states > 2 to satisfy this criteria, so I think we're okay
without further precautions added right away.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -847,26 +847,25 @@  static void mwait_idle(void)
 
 	update_last_cx_stat(power, cx, before);
 
-	if (cpu_is_haltable(cpu)) {
-		if (cx->irq_enable_early)
-			local_irq_enable();
+	if (cx->irq_enable_early)
+		local_irq_enable();
 
-		mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
+	mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
 
-		local_irq_disable();
-	}
+	local_irq_disable();
 
 	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);
-	TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after,
-		irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]);
 
 	/* Now back in C0. */
 	update_idle_stats(power, cx, before, after);
 	local_irq_enable();
 
+	TRACE_6D(TRC_PM_IDLE_EXIT, cx->type, after,
+		irq_traced[0], irq_traced[1], irq_traced[2], irq_traced[3]);
+
 	if (!(lapic_timer_reliable_states & (1 << cx->type)))
 		lapic_timer_on();