
[for-4.17,v2.1,2/3] amd/virt_ssbd: set SSBD at vCPU context switch

Message ID 20221029131258.95811-1-roger.pau@citrix.com (mailing list archive)
State New, archived

Commit Message

Roger Pau Monné Oct. 29, 2022, 1:12 p.m. UTC
The current logic for AMD SSBD context switches it on every
vm{entry,exit} if the Xen and guest selections don't match.  This is
expensive when not using SPEC_CTRL, and hence should be avoided as
much as possible.

When SSBD is not being set from SPEC_CTRL on AMD, don't context switch
it at vm{entry,exit}; instead only context switch SSBD when switching
vCPUs.  This has the side effect of running Xen code with the guest
selection of SSBD; the documentation is updated to note this behavior.
Also note that when `ssbd` is selected on the command line, the guest
SSBD selection will not have an effect, and the hypervisor will run
with SSBD unconditionally enabled when not using SPEC_CTRL itself.
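
All of those cases funnel through a single helper; the following is
condensed from the amd_set_ssbd() added by this patch (see the full
diff below):

    void amd_set_ssbd(bool enable)
    {
        if (opt_ssbd)
            /* SSBD is hardcoded on the command line: never turn it off. */
            return;

        if (cpu_has_virt_ssbd)
            wrmsr(MSR_VIRT_SPEC_CTRL, enable ? SPEC_CTRL_SSBD : 0, 0);
        else if (amd_legacy_ssbd)
            core_set_legacy_ssbd(enable); /* non-architectural LS_CFG path */
        else
            ASSERT_UNREACHABLE();
    }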

This fixes an issue with running C code in a GIF=0 region, which is
problematic when using UBSAN or other instrumentation techniques.

As a result of no longer running the code to set SSBD in a GIF=0
region, the locking of amd_set_legacy_ssbd() can be done using normal
spinlocks, and some more checks can be added to ensure it works as
intended.
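
For reference, the serialization in the reworked helper (named
core_set_legacy_ssbd() below) is now just a per-core refcount of
sibling threads requesting SSBD, taken under a regular IRQ-safe
spinlock; only the 0 -> 1 and 1 -> 0 transitions reach the hardware
via set_legacy_ssbd():

    spin_lock_irqsave(&status->lock, flags);
    status->count += enable ? 1 : -1;
    ASSERT(status->count <= c->x86_num_siblings);
    if (enable ? status->count == 1 : !status->count)
        BUG_ON(!set_legacy_ssbd(c, enable));
    spin_unlock_irqrestore(&status->lock, flags);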

Finally, it's also worth noting that since the guest SSBD selection
is no longer set on vmentry, the VIRT_SPEC_CTRL MSR handling needs to
propagate the value to the hardware as part of handling the wrmsr.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Fix calling set_reg unconditionally.

Changes since v1:
 - Just check virt_spec_ctrl value != 0 on context switch.
 - Remove stray asm newline.
 - Use val in svm_set_reg().
 - Fix style in amd.c.
 - Do not clear ssbd.
---
 docs/misc/xen-command-line.pandoc | 10 +++---
 xen/arch/x86/cpu/amd.c            | 55 +++++++++++++++++--------------
 xen/arch/x86/hvm/svm/entry.S      |  6 ----
 xen/arch/x86/hvm/svm/svm.c        | 49 ++++++++++++---------------
 xen/arch/x86/include/asm/amd.h    |  2 +-
 xen/arch/x86/msr.c                |  8 +++++
 6 files changed, 66 insertions(+), 64 deletions(-)

Comments

Jan Beulich Nov. 2, 2022, 11:49 a.m. UTC | #1
On 29.10.2022 15:12, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
>  
>      /* Resume use of ISTs now that the host TR is reinstated. */
>      enable_each_ist(idt_tables[cpu]);
> +
> +    /*
> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
> +     * is already cleared by svm_vmexit_spec_ctrl.
> +     */
> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> +    {
> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> +        amd_set_ssbd(false);
> +    }
>  }

Aren't you potentially turning off SSBD here just to ...

> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
>  
>      if ( cpu_has_msr_tsc_aux )
>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
> +
> +    /* Load SSBD if set by the guest. */
> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> +    {
> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> +        amd_set_ssbd(true);
> +    }
>  }

... turn it on here again? IOW wouldn't switching better be isolated to
just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
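
Roughly something like the following (a hypothetical sketch; the
per-CPU ssbd_on tracking variable is assumed and doesn't exist in the
patch):

    static DEFINE_PER_CPU(bool, ssbd_on);

    static void cf_check svm_ctxt_switch_to(struct vcpu *v)
    {
        bool want = v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD;

        /* ... existing restore logic ... */

        if ( want != this_cpu(ssbd_on) )
        {
            this_cpu(ssbd_on) = want;
            amd_set_ssbd(want);
        }
    }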

> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -697,7 +697,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>                  msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;
>          }
>          else
> +        {
>              msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
> +            if ( v == curr )
> +                /*
> +                 * Propagate the value to hardware, as it won't be context
> +                 * switched on vmentry.
> +                 */

I have to admit that I find "on vmentry" in the comment misleading: Reading
it I first thought you're still alluding to the old model. Plus I also find
the combination of "context switched" and "on vmentry" problematic, as we
generally mean something else when we say "context switch".

> +                goto set_reg;

It's not clear why you want to use hvm_set_reg() in the first place - the
comment says "propagate to hardware", which would mean wrmsrl() in the
usual case. Here it would mean a direct call to amd_set_ssbd() imo. That
would then also be in line with all other "v == curr" conditionals, none
of which apply to any "goto set_reg". ..._set_reg(), aiui, is meant only
for use in cases where vCPU state needs updating such that proper state
would be loaded later (e.g. during VM entry).
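
Concretely, something like this instead of the goto (just a sketch):

    else
    {
        msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
        if ( v == curr )
            /* Propagate the value to hardware. */
            amd_set_ssbd(val & SPEC_CTRL_SSBD);
    }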

Jan
Roger Pau Monné Nov. 2, 2022, 5:38 p.m. UTC | #2
On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
> On 29.10.2022 15:12, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/svm/svm.c
> > +++ b/xen/arch/x86/hvm/svm/svm.c
> > @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
> >  
> >      /* Resume use of ISTs now that the host TR is reinstated. */
> >      enable_each_ist(idt_tables[cpu]);
> > +
> > +    /*
> > +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
> > +     * is already cleared by svm_vmexit_spec_ctrl.
> > +     */
> > +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> > +    {
> > +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> > +        amd_set_ssbd(false);
> > +    }
> >  }
> 
> Aren't you potentially turning off SSBD here just to ...
> 
> > @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
> >  
> >      if ( cpu_has_msr_tsc_aux )
> >          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
> > +
> > +    /* Load SSBD if set by the guest. */
> > +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> > +    {
> > +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> > +        amd_set_ssbd(true);
> > +    }
> >  }
> 
> ... turn it on here again? IOW wouldn't switching better be isolated to
> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?

What if we switch from a HVM vCPU into a PV one?  AFAICT then
svm_ctxt_switch_to() won't get called and we would be running the PV
guest with the previous HVM domain SSBD selection.
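
(For reference, __context_switch() dispatches through per-guest-type
hooks, roughly:

    p->arch.ctxt_switch->from(p); /* svm_ctxt_switch_from() for an SVM vCPU */
    ...
    n->arch.ctxt_switch->to(n);   /* paravirt_ctxt_switch_to() for a PV vCPU */

so on an HVM -> PV switch only the SVM "from" hook runs.)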


> > --- a/xen/arch/x86/msr.c
> > +++ b/xen/arch/x86/msr.c
> > @@ -697,7 +697,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> >                  msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;
> >          }
> >          else
> > +        {
> >              msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
> > +            if ( v == curr )
> > +                /*
> > +                 * Propagate the value to hardware, as it won't be context
> > +                 * switched on vmentry.
> > +                 */
> 
> I have to admit that I find "on vmentry" in the comment misleading: Reading
> it I first thought you're still alluding to the old model. Plus I also find
> the combination of "context switched" and "on vmentry" problematic, as we
> generally mean something else when we say "context switch".

I had a hard time wording this, because of the Xen/guest vs vCPU
context switches.

What about:

"Propagate the value to hardware, as it won't we set on guest resume
path."


> > +                goto set_reg;
> 
> It's not clear why you want to use hvm_set_reg() in the first place - the
> comment says "propagate to hardware", which would mean wrmsrl() in the
> usual case. Here it would mean a direct call to amd_set_ssbd() imo. That
> would then also be in line with all other "v == curr" conditionals, none
> of which apply to any "goto set_reg". ..._set_reg(), aiui, is meant only
> for use in cases where vCPU state needs updating such that proper state
> would be loaded later (e.g. during VM entry).

I thought it was better to hide those vendor-specific calls in the
already existing vendor hooks (set_reg).  I don't mind calling
amd_set_ssbd() directly here if that's preferred; it seemed kind of a
layering violation when we have vendor-specific hooks in place.

Thanks, Roger.
Jan Beulich Nov. 3, 2022, 8:09 a.m. UTC | #3
On 02.11.2022 18:38, Roger Pau Monné wrote:
> On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
>> On 29.10.2022 15:12, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
>>>  
>>>      /* Resume use of ISTs now that the host TR is reinstated. */
>>>      enable_each_ist(idt_tables[cpu]);
>>> +
>>> +    /*
>>> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
>>> +     * is already cleared by svm_vmexit_spec_ctrl.
>>> +     */
>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>> +    {
>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>> +        amd_set_ssbd(false);
>>> +    }
>>>  }
>>
>> Aren't you potentially turning off SSBD here just to ...
>>
>>> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
>>>  
>>>      if ( cpu_has_msr_tsc_aux )
>>>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
>>> +
>>> +    /* Load SSBD if set by the guest. */
>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>> +    {
>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>> +        amd_set_ssbd(true);
>>> +    }
>>>  }
>>
>> ... turn it on here again? IOW wouldn't switching better be isolated to
>> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
> 
> What if we switch from a HVM vCPU into a PV one?  AFAICT then
> svm_ctxt_switch_to() won't get called and we would be running the PV
> guest with the previous HVM domain SSBD selection.

Would that be a problem? Or in other words: What is the intended behavior
for PV? PV domains can control SSBD via SPEC_CTRL (only), so all we need
to guarantee is that we respect their choice there.

>>> --- a/xen/arch/x86/msr.c
>>> +++ b/xen/arch/x86/msr.c
>>> @@ -697,7 +697,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
>>>                  msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;
>>>          }
>>>          else
>>> +        {
>>>              msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
>>> +            if ( v == curr )
>>> +                /*
>>> +                 * Propagate the value to hardware, as it won't be context
>>> +                 * switched on vmentry.
>>> +                 */
>>
>> I have to admit that I find "on vmentry" in the comment misleading: Reading
>> it I first thought you're still alluding to the old model. Plus I also find
>> the combination of "context switched" and "on vmentry" problematic, as we
>> generally mean something else when we say "context switch".
> 
> I had a hard time wording this, because of the Xen/guest vs vCPU
> context switches.
> 
> What about:
> 
> "Propagate the value to hardware, as it won't we set on guest resume
> path."

Sounds better, thanks (with s/we/be/).

>>> +                goto set_reg;
>>
>> It's not clear why you want to use hvm_set_reg() in the first place - the
>> comment says "propagate to hardware", which would mean wrmsrl() in the
>> usual case. Here it would mean a direct call to amd_set_ssbd() imo. That
>> would then also be in line with all other "v == curr" conditionals, none
>> of which apply to any "goto set_reg". ..._set_reg(), aiui, is meant only
>> for use in cases where vCPU state needs updating such that proper state
>> would be loaded later (e.g. during VM entry).
> 
> I thought it was better to hide those vendor specific calls in the
> already existing vendor hooks (set_reg).  I don't mind calling
> amd_set_ssbd() directly here if that's preferred, it seemed kind of a
> layering violation when we have vendor specific hooks in place.

Well, Andrew of course should correct me if I'm wrong, but my understanding
of the get/set-reg interface is as described. On which grounds I don't see
any layering violation here - doing the call right here is merely a more
involved flavor of wrmsrl().

Jan
Roger Pau Monné Nov. 3, 2022, 8:52 a.m. UTC | #4
On Thu, Nov 03, 2022 at 09:09:41AM +0100, Jan Beulich wrote:
> On 02.11.2022 18:38, Roger Pau Monné wrote:
> > On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
> >> On 29.10.2022 15:12, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/svm/svm.c
> >>> +++ b/xen/arch/x86/hvm/svm/svm.c
> >>> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
> >>>  
> >>>      /* Resume use of ISTs now that the host TR is reinstated. */
> >>>      enable_each_ist(idt_tables[cpu]);
> >>> +
> >>> +    /*
> >>> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
> >>> +     * is already cleared by svm_vmexit_spec_ctrl.
> >>> +     */
> >>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> >>> +    {
> >>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> >>> +        amd_set_ssbd(false);
> >>> +    }
> >>>  }
> >>
> >> Aren't you potentially turning off SSBD here just to ...
> >>
> >>> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
> >>>  
> >>>      if ( cpu_has_msr_tsc_aux )
> >>>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
> >>> +
> >>> +    /* Load SSBD if set by the guest. */
> >>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> >>> +    {
> >>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> >>> +        amd_set_ssbd(true);
> >>> +    }
> >>>  }
> >>
> >> ... turn it on here again? IOW wouldn't switching better be isolated to
> >> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
> > 
> > What if we switch from a HVM vCPU into a PV one?  AFAICT then
> > svm_ctxt_switch_to() won't get called and we would be running the PV
> > guest with the previous HVM domain SSBD selection.
> 
> Would that be a problem? Or in other words: What is the intended behavior
> for PV? PV domains can control SSBD via SPEC_CTRL (only), so all we need
> to guarantee is that we respect their choice there.

If the hardware only supports the non-architectural way (LS_CFG) or
VIRT_SPEC_CTRL to set SSBD, then PV guests won't be able to change the
setting inherited from a previously running HVM guest.  IMO it's fine
to run Xen code with the guest selection of SSBD, but carrying such a
selection (ie: SSBD set) across guest context switches would be too
big a penalty.

> >>> --- a/xen/arch/x86/msr.c
> >>> +++ b/xen/arch/x86/msr.c
> >>> @@ -697,7 +697,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
> >>>                  msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;
> >>>          }
> >>>          else
> >>> +        {
> >>>              msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
> >>> +            if ( v == curr )
> >>> +                /*
> >>> +                 * Propagate the value to hardware, as it won't be context
> >>> +                 * switched on vmentry.
> >>> +                 */
> >>
> >> I have to admit that I find "on vmentry" in the comment misleading: Reading
> >> it I first thought you're still alluding to the old model. Plus I also find
> >> the combination of "context switched" and "on vmentry" problematic, as we
> >> generally mean something else when we say "context switch".
> > 
> > I had a hard time wording this, because of the Xen/guest vs vCPU
> > context switches.
> > 
> > What about:
> > 
> > "Propagate the value to hardware, as it won't we set on guest resume
> > path."
> 
> Sounds better, thanks (with s/we/be/).

Oh, yes, sorry.

> 
> >>> +                goto set_reg;
> >>
> >> It's not clear why you want to use hvm_set_reg() in the first place - the
> >> comment says "propagate to hardware", which would mean wrmsrl() in the
> >> usual case. Here it would mean a direct call to amd_set_ssbd() imo. That
> >> would then also be in line with all other "v == curr" conditionals, none
> >> of which apply to any "goto set_reg". ..._set_reg(), aiui, is meant only
> >> for use in cases where vCPU state needs updating such that proper state
> >> would be loaded later (e.g. during VM entry).
> > 
> > I thought it was better to hide those vendor specific calls in the
> > already existing vendor hooks (set_reg).  I don't mind calling
> > amd_set_ssbd() directly here if that's preferred, it seemed kind of a
> > layering violation when we have vendor specific hooks in place.
> 
> Well, Andrew of course should correct me if I'm wrong, but my understanding
> of the get/set-reg interface is as described. On which grounds I don't see
> any layering violation here - doing the call right here is merely a more
> involved flavor of wrmsrl().

OK, will change, but first we need an agreement on the SSBD context
switch comment.

Thanks, Roger.
Jan Beulich Nov. 3, 2022, 9:01 a.m. UTC | #5
On 03.11.2022 09:52, Roger Pau Monné wrote:
> On Thu, Nov 03, 2022 at 09:09:41AM +0100, Jan Beulich wrote:
>> On 02.11.2022 18:38, Roger Pau Monné wrote:
>>> On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
>>>> On 29.10.2022 15:12, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>>>> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
>>>>>  
>>>>>      /* Resume use of ISTs now that the host TR is reinstated. */
>>>>>      enable_each_ist(idt_tables[cpu]);
>>>>> +
>>>>> +    /*
>>>>> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
>>>>> +     * is already cleared by svm_vmexit_spec_ctrl.
>>>>> +     */
>>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>>>> +    {
>>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>>>> +        amd_set_ssbd(false);
>>>>> +    }
>>>>>  }
>>>>
>>>> Aren't you potentially turning off SSBD here just to ...
>>>>
>>>>> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
>>>>>  
>>>>>      if ( cpu_has_msr_tsc_aux )
>>>>>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
>>>>> +
>>>>> +    /* Load SSBD if set by the guest. */
>>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>>>> +    {
>>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>>>> +        amd_set_ssbd(true);
>>>>> +    }
>>>>>  }
>>>>
>>>> ... turn it on here again? IOW wouldn't switching better be isolated to
>>>> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
>>>
>>> What if we switch from a HVM vCPU into a PV one?  AFAICT then
>>> svm_ctxt_switch_to() won't get called and we would be running the PV
>>> guest with the previous HVM domain SSBD selection.
>>
>> Would that be a problem? Or in other words: What is the intended behavior
>> for PV? PV domains can control SSBD via SPEC_CTRL (only), so all we need
>> to guarantee is that we respect their choice there.
> 
> If the hardware only supports non-architectural way (LS_CFG) or
> VIRT_SPEC_CTRL to set SSBD then PV guests won't be able to change the
> setting inherited from a previously running HVM guest. IMO it's fine
> to run Xen code with the guest selection of SSBD, but carrying such
> selection (ie: SSBD set) across guest context switches will be a too
> big penalty.

Hmm, perhaps. Question then is whether to better turn it off from
paravirt_ctxt_switch_to() (which would take care of the idle domain as
well, if we want it off there rather than considering the idle domain
as "Xen context"). Or, yet another option, don't use
*_ctxt_switch_{from,to}() at all but invoke it directly from
__context_switch().
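
The first option could look roughly like this (a hypothetical sketch;
the feature guard is assumed, as amd_set_ssbd() expects one of the
mechanisms to be available):

    static void cf_check paravirt_ctxt_switch_to(struct vcpu *v)
    {
        /* ... */

        /* Don't leak a previous HVM vCPU's SSBD selection into PV context. */
        if ( cpu_has_virt_ssbd || amd_legacy_ssbd )
            amd_set_ssbd(false);
    }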

Jan
Roger Pau Monné Nov. 3, 2022, 10:36 a.m. UTC | #6
On Thu, Nov 03, 2022 at 10:01:49AM +0100, Jan Beulich wrote:
> On 03.11.2022 09:52, Roger Pau Monné wrote:
> > On Thu, Nov 03, 2022 at 09:09:41AM +0100, Jan Beulich wrote:
> >> On 02.11.2022 18:38, Roger Pau Monné wrote:
> >>> On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
> >>>> On 29.10.2022 15:12, Roger Pau Monne wrote:
> >>>>> --- a/xen/arch/x86/hvm/svm/svm.c
> >>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
> >>>>> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
> >>>>>  
> >>>>>      /* Resume use of ISTs now that the host TR is reinstated. */
> >>>>>      enable_each_ist(idt_tables[cpu]);
> >>>>> +
> >>>>> +    /*
> >>>>> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
> >>>>> +     * is already cleared by svm_vmexit_spec_ctrl.
> >>>>> +     */
> >>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> >>>>> +    {
> >>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> >>>>> +        amd_set_ssbd(false);
> >>>>> +    }
> >>>>>  }
> >>>>
> >>>> Aren't you potentially turning off SSBD here just to ...
> >>>>
> >>>>> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
> >>>>>  
> >>>>>      if ( cpu_has_msr_tsc_aux )
> >>>>>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
> >>>>> +
> >>>>> +    /* Load SSBD if set by the guest. */
> >>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
> >>>>> +    {
> >>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
> >>>>> +        amd_set_ssbd(true);
> >>>>> +    }
> >>>>>  }
> >>>>
> >>>> ... turn it on here again? IOW wouldn't switching better be isolated to
> >>>> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
> >>>
> >>> What if we switch from a HVM vCPU into a PV one?  AFAICT then
> >>> svm_ctxt_switch_to() won't get called and we would be running the PV
> >>> guest with the previous HVM domain SSBD selection.
> >>
> >> Would that be a problem? Or in other words: What is the intended behavior
> >> for PV? PV domains can control SSBD via SPEC_CTRL (only), so all we need
> >> to guarantee is that we respect their choice there.
> > 
> > If the hardware only supports non-architectural way (LS_CFG) or
> > VIRT_SPEC_CTRL to set SSBD then PV guests won't be able to change the
> > setting inherited from a previously running HVM guest. IMO it's fine
> > to run Xen code with the guest selection of SSBD, but carrying such
> > selection (ie: SSBD set) across guest context switches will be a too
> > big penalty.
> 
> Hmm, perhaps. Question then is whether to better turn it off from
> paravirt_ctxt_switch_to() (which would take care of the idle domain as
> well, if we want it off there rather than considering the idle domain
> as "Xen context"). Or, yet another option, don't use
> *_ctxt_switch_{from,to}() at all but invoke it directly from
> __context_switch().

I consider it fine to run the idle domain with the guest SSBD
selection, or else switching to/from the idle domain could cause
toggling of SSBD, which is an unneeded penalty.

If there's a specific issue that needs dealing with I'm happy to
adjust; otherwise I think the proposed approach is an acceptable
compromise to avoid excessive toggling of SSBD when not using
SPEC_CTRL.

Thanks, Roger.
Jan Beulich Nov. 3, 2022, 2:05 p.m. UTC | #7
On 03.11.2022 11:36, Roger Pau Monné wrote:
> On Thu, Nov 03, 2022 at 10:01:49AM +0100, Jan Beulich wrote:
>> On 03.11.2022 09:52, Roger Pau Monné wrote:
>>> On Thu, Nov 03, 2022 at 09:09:41AM +0100, Jan Beulich wrote:
>>>> On 02.11.2022 18:38, Roger Pau Monné wrote:
>>>>> On Wed, Nov 02, 2022 at 12:49:17PM +0100, Jan Beulich wrote:
>>>>>> On 29.10.2022 15:12, Roger Pau Monne wrote:
>>>>>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>>>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>>>>>> @@ -973,6 +973,16 @@ static void cf_check svm_ctxt_switch_from(struct vcpu *v)
>>>>>>>  
>>>>>>>      /* Resume use of ISTs now that the host TR is reinstated. */
>>>>>>>      enable_each_ist(idt_tables[cpu]);
>>>>>>> +
>>>>>>> +    /*
>>>>>>> +     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
>>>>>>> +     * is already cleared by svm_vmexit_spec_ctrl.
>>>>>>> +     */
>>>>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>>>>>> +    {
>>>>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>>>>>> +        amd_set_ssbd(false);
>>>>>>> +    }
>>>>>>>  }
>>>>>>
>>>>>> Aren't you potentially turning off SSBD here just to ...
>>>>>>
>>>>>>> @@ -1000,6 +1010,13 @@ static void cf_check svm_ctxt_switch_to(struct vcpu *v)
>>>>>>>  
>>>>>>>      if ( cpu_has_msr_tsc_aux )
>>>>>>>          wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
>>>>>>> +
>>>>>>> +    /* Load SSBD if set by the guest. */
>>>>>>> +    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
>>>>>>> +    {
>>>>>>> +        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
>>>>>>> +        amd_set_ssbd(true);
>>>>>>> +    }
>>>>>>>  }
>>>>>>
>>>>>> ... turn it on here again? IOW wouldn't switching better be isolated to
>>>>>> just svm_ctxt_switch_to(), doing nothing if already in the intended mode?
>>>>>
>>>>> What if we switch from a HVM vCPU into a PV one?  AFAICT then
>>>>> svm_ctxt_switch_to() won't get called and we would be running the PV
>>>>> guest with the previous HVM domain SSBD selection.
>>>>
>>>> Would that be a problem? Or in other words: What is the intended behavior
>>>> for PV? PV domains can control SSBD via SPEC_CTRL (only), so all we need
>>>> to guarantee is that we respect their choice there.
>>>
>>> If the hardware only supports non-architectural way (LS_CFG) or
>>> VIRT_SPEC_CTRL to set SSBD then PV guests won't be able to change the
>>> setting inherited from a previously running HVM guest. IMO it's fine
>>> to run Xen code with the guest selection of SSBD, but carrying such
>>> selection (ie: SSBD set) across guest context switches will be a too
>>> big penalty.
>>
>> Hmm, perhaps. Question then is whether to better turn it off from
>> paravirt_ctxt_switch_to() (which would take care of the idle domain as
>> well, if we want it off there rather than considering the idle domain
>> as "Xen context"). Or, yet another option, don't use
>> *_ctxt_switch_{from,to}() at all but invoke it directly from
>> __context_switch().
> 
> I consider it fine to run the idle domain with the guest SSBD
> selection, or else switching to/from the idle domain could cause
> toggling of SSBD which is an unneeded penalty.
> 
> If there's an specific issue that needs dealing with I'm happy to
> adjust, otherwise I think the proposed approach is an acceptable
> compromise to avoid excessive toggling of SSBD when not using
> SPEC_CTRL.

Well, perhaps. What I was suggesting would further limit the toggling,
but I'm not going to insist on you going that route.

Jan

Patch

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 0fbdcb574f..424b12cfb2 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2372,10 +2372,12 @@  By default, Xen will use STIBP when IBRS is in use (IBRS implies STIBP), and
 when hardware hints recommend using it as a blanket setting.
 
 On hardware supporting SSBD (Speculative Store Bypass Disable), the `ssbd=`
-option can be used to force or prevent Xen using the feature itself.  On AMD
-hardware, this is a global option applied at boot, and not virtualised for
-guest use.  On Intel hardware, the feature is virtualised for guests,
-independently of Xen's choice of setting.
+option can be used to force or prevent Xen using the feature itself.  The
+feature is virtualised for guests, independently of Xen's choice of setting.
+On AMD hardware, disabling Xen SSBD usage on the command line (`ssbd=0` which
+is the default value) can lead to Xen running with the guest SSBD selection
+depending on hardware support, on the same hardware setting `ssbd=1` will
+result in SSBD always being enabled, regardless of guest choice.
 
 On hardware supporting PSFD (Predictive Store Forwarding Disable), the `psfd=`
 option can be used to force or prevent Xen using the feature itself.  By
diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 98c52d0686..05d72c6501 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -742,7 +742,7 @@  void amd_init_ssbd(const struct cpuinfo_x86 *c)
 }
 
 static struct ssbd_ls_cfg {
-    bool locked;
+    spinlock_t lock;
     unsigned int count;
 } __cacheline_aligned *ssbd_ls_cfg;
 static unsigned int __ro_after_init ssbd_max_cores;
@@ -753,7 +753,7 @@  bool __init amd_setup_legacy_ssbd(void)
 	unsigned int i;
 
 	if ((boot_cpu_data.x86 != 0x17 && boot_cpu_data.x86 != 0x18) ||
-	    boot_cpu_data.x86_num_siblings <= 1)
+	    boot_cpu_data.x86_num_siblings <= 1 || opt_ssbd)
 		return true;
 
 	/*
@@ -776,46 +776,51 @@  bool __init amd_setup_legacy_ssbd(void)
 	if (!ssbd_ls_cfg)
 		return false;
 
-	if (opt_ssbd)
-		for (i = 0; i < ssbd_max_cores * AMD_FAM17H_MAX_SOCKETS; i++)
-			/* Set initial state, applies to any (hotplug) CPU. */
-			ssbd_ls_cfg[i].count = boot_cpu_data.x86_num_siblings;
+	for (i = 0; i < ssbd_max_cores * AMD_FAM17H_MAX_SOCKETS; i++)
+		spin_lock_init(&ssbd_ls_cfg[i].lock);
 
 	return true;
 }
 
-/*
- * Executed from GIF==0 context: avoid using BUG/ASSERT or other functionality
- * that relies on exceptions as those are not expected to run in GIF==0
- * context.
- */
-void amd_set_legacy_ssbd(bool enable)
+static void core_set_legacy_ssbd(bool enable)
 {
 	const struct cpuinfo_x86 *c = &current_cpu_data;
 	struct ssbd_ls_cfg *status;
+	unsigned long flags;
 
 	if ((c->x86 != 0x17 && c->x86 != 0x18) || c->x86_num_siblings <= 1) {
-		set_legacy_ssbd(c, enable);
+		BUG_ON(!set_legacy_ssbd(c, enable));
 		return;
 	}
 
+	BUG_ON(c->phys_proc_id >= AMD_FAM17H_MAX_SOCKETS);
+	BUG_ON(c->cpu_core_id >= ssbd_max_cores);
 	status = &ssbd_ls_cfg[c->phys_proc_id * ssbd_max_cores +
 	                      c->cpu_core_id];
 
-	/*
-	 * Open code a very simple spinlock: this function is used with GIF==0
-	 * and different IF values, so would trigger the checklock detector.
-	 * Instead of trying to workaround the detector, use a very simple lock
-	 * implementation: it's better to reduce the amount of code executed
-	 * with GIF==0.
-	 */
-	while (test_and_set_bool(status->locked))
-		cpu_relax();
+	spin_lock_irqsave(&status->lock, flags);
 	status->count += enable ? 1 : -1;
+	ASSERT(status->count <= c->x86_num_siblings);
 	if (enable ? status->count == 1 : !status->count)
-		set_legacy_ssbd(c, enable);
-	barrier();
-	write_atomic(&status->locked, false);
+		BUG_ON(!set_legacy_ssbd(c, enable));
+	spin_unlock_irqrestore(&status->lock, flags);
+}
+
+void amd_set_ssbd(bool enable)
+{
+	if (opt_ssbd)
+		/*
+		 * Ignore attempts to turn off SSBD, it's hardcoded on the
+		 * command line.
+		 */
+		return;
+
+	if (cpu_has_virt_ssbd)
+		wrmsr(MSR_VIRT_SPEC_CTRL, enable ? SPEC_CTRL_SSBD : 0, 0);
+	else if (amd_legacy_ssbd)
+		core_set_legacy_ssbd(enable);
+	else
+		ASSERT_UNREACHABLE();
 }
 
 /*
diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S
index a26589aa9a..981cd82e7c 100644
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -59,9 +59,6 @@  __UNLIKELY_END(nsvm_hap)
 
         clgi
 
-        ALTERNATIVE "", STR(call vmentry_virt_spec_ctrl), \
-                        X86_FEATURE_VIRT_SC_MSR_HVM
-
         /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
         /* SPEC_CTRL_EXIT_TO_SVM       Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
         .macro svm_vmentry_spec_ctrl
@@ -131,9 +128,6 @@  __UNLIKELY_END(nsvm_hap)
         ALTERNATIVE "", svm_vmexit_spec_ctrl, X86_FEATURE_SC_MSR_HVM
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
-        ALTERNATIVE "", STR(call vmexit_virt_spec_ctrl), \
-                        X86_FEATURE_VIRT_SC_MSR_HVM
-
         /*
          * STGI is executed unconditionally, and is sufficiently serialising
          * to safely resolve any Spectre-v1 concerns in the above logic.
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 1aeaabcb13..b2f147c11b 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -973,6 +973,16 @@  static void cf_check svm_ctxt_switch_from(struct vcpu *v)
 
     /* Resume use of ISTs now that the host TR is reinstated. */
     enable_each_ist(idt_tables[cpu]);
+
+    /*
+     * Clear previous guest selection of SSBD if set.  Note that SPEC_CTRL.SSBD
+     * is already cleared by svm_vmexit_spec_ctrl.
+     */
+    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
+    {
+        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
+        amd_set_ssbd(false);
+    }
 }
 
 static void cf_check svm_ctxt_switch_to(struct vcpu *v)
@@ -1000,6 +1010,13 @@  static void cf_check svm_ctxt_switch_to(struct vcpu *v)
 
     if ( cpu_has_msr_tsc_aux )
         wrmsr_tsc_aux(v->arch.msrs->tsc_aux);
+
+    /* Load SSBD if set by the guest. */
+    if ( v->arch.msrs->virt_spec_ctrl.raw & SPEC_CTRL_SSBD )
+    {
+        ASSERT(v->domain->arch.cpuid->extd.virt_ssbd);
+        amd_set_ssbd(true);
+    }
 }
 
 static void noreturn cf_check svm_do_resume(void)
@@ -2518,6 +2535,10 @@  static void cf_check svm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val)
         vmcb->spec_ctrl = val;
         break;
 
+    case MSR_VIRT_SPEC_CTRL:
+        amd_set_ssbd(val & SPEC_CTRL_SSBD);
+        break;
+
     default:
         printk(XENLOG_G_ERR "%s(%pv, 0x%08x, 0x%016"PRIx64") Bad register\n",
                __func__, v, reg, val);
@@ -3116,34 +3137,6 @@  void svm_vmexit_handler(struct cpu_user_regs *regs)
     vmcb_set_vintr(vmcb, intr);
 }
 
-/* Called with GIF=0. */
-void vmexit_virt_spec_ctrl(void)
-{
-    unsigned int val = opt_ssbd ? SPEC_CTRL_SSBD : 0;
-
-    if ( val == current->arch.msrs->virt_spec_ctrl.raw )
-        return;
-
-    if ( cpu_has_virt_ssbd )
-        wrmsr(MSR_VIRT_SPEC_CTRL, val, 0);
-    else
-        amd_set_legacy_ssbd(val);
-}
-
-/* Called with GIF=0. */
-void vmentry_virt_spec_ctrl(void)
-{
-    unsigned int val = current->arch.msrs->virt_spec_ctrl.raw;
-
-    if ( val == (opt_ssbd ? SPEC_CTRL_SSBD : 0) )
-        return;
-
-    if ( cpu_has_virt_ssbd )
-        wrmsr(MSR_VIRT_SPEC_CTRL, val, 0);
-    else
-        amd_set_legacy_ssbd(val);
-}
-
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/include/asm/amd.h b/xen/arch/x86/include/asm/amd.h
index 6a42f68542..81ed71710f 100644
--- a/xen/arch/x86/include/asm/amd.h
+++ b/xen/arch/x86/include/asm/amd.h
@@ -153,6 +153,6 @@  void amd_check_disable_c1e(unsigned int port, u8 value);
 
 extern bool amd_legacy_ssbd;
 bool amd_setup_legacy_ssbd(void);
-void amd_set_legacy_ssbd(bool enable);
+void amd_set_ssbd(bool enable);
 
 #endif /* __AMD_H__ */
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 95416995a5..d15185cd48 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -697,7 +697,15 @@  int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
                 msrs->spec_ctrl.raw &= ~SPEC_CTRL_SSBD;
         }
         else
+        {
             msrs->virt_spec_ctrl.raw = val & SPEC_CTRL_SSBD;
+            if ( v == curr )
+                /*
+                 * Propagate the value to hardware, as it won't be context
+                 * switched on vmentry.
+                 */
+                goto set_reg;
+        }
         break;
 
     case MSR_AMD64_DE_CFG: