diff mbox series

[v2] x86/AMD: unbreak CPU hotplug on AMD systems without RstrFpErrPtrs

Message ID 1575417367-12822-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
State New, archived
Headers show
Series [v2] x86/AMD: unbreak CPU hotplug on AMD systems without RstrFpErrPtrs | expand

Commit Message

Igor Druzhinin Dec. 3, 2019, 11:56 p.m. UTC
If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
feature at CPU identification time. This is especially noticeable in
PV-shim that usually hotplugs its vCPUs. We either need to restrict this
action for boot CPU only or allow secondary CPUs to modify
forced CPU capabilities at runtime. Choose the former since modifying
forced capabilities out of boot path leaves the system in potentially
inconsistent state.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
Changes in v2:
- pick the former approach instead of the latter
---
 xen/arch/x86/cpu/amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich Dec. 4, 2019, 9:13 a.m. UTC | #1
On 04.12.2019 00:56, Igor Druzhinin wrote:
> If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
> feature at CPU identification time. This is especially noticeable in
> PV-shim that usually hotplugs its vCPUs. We either need to restrict this
> action for boot CPU only or allow secondary CPUs to modify
> forced CPU capabilities at runtime. Choose the former since modifying
> forced capabilities out of boot path leaves the system in potentially
> inconsistent state.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

Andrew,

please clarify explicitly whether you're okay with this going in
despite your earlier objection, or whether you want us to follow
the outlined alternative route (or yet some different approach).
FAOD I continue to insist that the __init not be dropped from the
two functions in question.

Jan
Jan Beulich Dec. 10, 2019, 10:10 a.m. UTC | #2
On 04.12.2019 00:56, Igor Druzhinin wrote:
> If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
> feature at CPU identification time. This is especially noticeable in
> PV-shim that usually hotplugs its vCPUs. We either need to restrict this
> action for boot CPU only or allow secondary CPUs to modify
> forced CPU capabilities at runtime. Choose the former since modifying
> forced capabilities out of boot path leaves the system in potentially
> inconsistent state.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

I've committed this to unstable, as per the outcome of the
community call. What about this for 4.13? Iirc the breakage was
introduced during this development cycle.

Jan

> ---
> Changes in v2:
> - pick the former approach instead of the latter
> ---
>  xen/arch/x86/cpu/amd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> index fec2830..8b5f0f2 100644
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -583,7 +583,7 @@ static void init_amd(struct cpuinfo_x86 *c)
>  	 * Older AMD CPUs don't save/load FOP/FIP/FDP unless an FPU exception
>  	 * is pending.  Xen works around this at (F)XRSTOR time.
>  	 */
> -	if (!cpu_has(c, X86_FEATURE_RSTR_FP_ERR_PTRS))
> +	if (c == &boot_cpu_data && !cpu_has(c, X86_FEATURE_RSTR_FP_ERR_PTRS))
>  		setup_force_cpu_cap(X86_BUG_FPU_PTRS);
>  
>  	/*
>
Jürgen Groß Dec. 10, 2019, 10:26 a.m. UTC | #3
On 10.12.19 11:10, Jan Beulich wrote:
> On 04.12.2019 00:56, Igor Druzhinin wrote:
>> If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
>> feature at CPU identification time. This is especially noticeable in
>> PV-shim that usually hotplugs its vCPUs. We either need to restrict this
>> action for boot CPU only or allow secondary CPUs to modify
>> forced CPU capabilities at runtime. Choose the former since modifying
>> forced capabilities out of boot path leaves the system in potentially
>> inconsistent state.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> 
> I've committed this to unstable, as per the outcome of the
> community call. What about this for 4.13? Iirc the breakage was
> introduced during this development cycle.

In this case:

Release-acked-by: Juergen Gross <jgross@suse.com>


Juergen
Andrew Cooper Dec. 10, 2019, 12:37 p.m. UTC | #4
On 10/12/2019 10:26, Jürgen Groß wrote:
> On 10.12.19 11:10, Jan Beulich wrote:
>> On 04.12.2019 00:56, Igor Druzhinin wrote:
>>> If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
>>> feature at CPU identification time. This is especially noticeable in
>>> PV-shim that usually hotplugs its vCPUs. We either need to restrict
>>> this
>>> action for boot CPU only or allow secondary CPUs to modify
>>> forced CPU capabilities at runtime. Choose the former since modifying
>>> forced capabilities out of boot path leaves the system in potentially
>>> inconsistent state.
>>>
>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>
>> I've committed this to unstable, as per the outcome of the
>> community call.

What outcome?  Yes technically your R-by is sufficient to get the patch
in, but you know very well there are open objections against this
version of the patch.

Also, you're actually in a position where you are reviewing your own
work, which is not how R-by is intended to work.

Furthermore, you will observe that there is an action item on me from
the call to come up with a less broken alternative which I'm genuinely
attempting to do.

~Andrew
Jan Beulich Dec. 10, 2019, 1:58 p.m. UTC | #5
On 10.12.2019 13:37, Andrew Cooper wrote:
> On 10/12/2019 10:26, Jürgen Groß wrote:
>> On 10.12.19 11:10, Jan Beulich wrote:
>>> On 04.12.2019 00:56, Igor Druzhinin wrote:
>>>> If the feature is not present Xen will try to force X86_BUG_FPU_PTRS
>>>> feature at CPU identification time. This is especially noticeable in
>>>> PV-shim that usually hotplugs its vCPUs. We either need to restrict
>>>> this
>>>> action for boot CPU only or allow secondary CPUs to modify
>>>> forced CPU capabilities at runtime. Choose the former since modifying
>>>> forced capabilities out of boot path leaves the system in potentially
>>>> inconsistent state.
>>>>
>>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>>
>>> I've committed this to unstable, as per the outcome of the
>>> community call.
> 
> What outcome?  Yes technically your R-by is sufficient to get the patch
> in, but you know very well there are open objections against this
> version of the patch.

My proposal on the call was to go with the existing patch, and improve
from there. There wasn't great enthusiasm, but there was agreement that
this is the most pragmatic route to follow. In particular I don't
recall you voicing any objection to this on the call (I do very well
recall the objection you voiced earlier on, which I had responded to
with a suggestion of a slightly different approach, taking care [I
think] of your wishes as well as mine).

> Also, you're actually in a position where you are reviewing your own
> work, which is not how R-by is intended to work.

Which own work? The patch doesn't even carry a Suggested-by. Iirc
Igor told me that what has gone in is how he had it initially. So
I'm pretty confused by this statement of yours.

Furthermore, as a recurring pattern, simply not responding to ongoing
discussions should not, in the common case, lead to no progress at
all. Iirc this was also mentioned on the call ("lazy consensus").

> Furthermore, you will observe that there is an action item on me from
> the call to come up with a less broken alternative which I'm genuinely
> attempting to do.

There's no indication towards this in the minutes, afaics. Or wait
- the same topic appears twice there (as both 4 and 6). I wasn't
even aware of such an action item. I'll be happy to revert if you
indicate so, and if a better fix is going to show up in time.

Jan
Igor Druzhinin Dec. 10, 2019, 2:07 p.m. UTC | #6
On 10/12/2019 13:58, Jan Beulich wrote:
> On 10.12.2019 13:37, Andrew Cooper wrote:
>> Furthermore, you will observe that there is an action item on me from
>> the call to come up with a less broken alternative which I'm genuinely
>> attempting to do.
> 
> There's no indication towards this in the minutes, afaics. Or wait
> - the same topic appears twice there (as both 4 and 6). I wasn't
> even aware of such an action item. I'll be happy to revert if you
> indicate so, and if a better fix is going to show up in time.

I don't think reverting would make sense - the patch doesn't break
anything, even more - it actually fixes the problem we observed.
If there is an improvement to that coming - it should be just done
on top of this also potentially taking care of other places in the code
that might be affected.

Igor
diff mbox series

Patch

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index fec2830..8b5f0f2 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -583,7 +583,7 @@  static void init_amd(struct cpuinfo_x86 *c)
 	 * Older AMD CPUs don't save/load FOP/FIP/FDP unless an FPU exception
 	 * is pending.  Xen works around this at (F)XRSTOR time.
 	 */
-	if (!cpu_has(c, X86_FEATURE_RSTR_FP_ERR_PTRS))
+	if (c == &boot_cpu_data && !cpu_has(c, X86_FEATURE_RSTR_FP_ERR_PTRS))
 		setup_force_cpu_cap(X86_BUG_FPU_PTRS);
 
 	/*