| Message ID | 1458133226-1808-1-git-send-email-paul.durrant@citrix.com (mailing list archive) |
|---|---|
| State | New, archived |
On 16/03/16 13:00, Paul Durrant wrote:
> Commit b38d426a "flush remote tlbs by hypercall" added support to allow
> Windows to request a flush of remote TLBs via hypercall rather than by
> IPI. Unfortunately it seems that this code was broken in a couple of
> ways:
>
> 1) The allocation of the per-vcpu flush mask is gated on whether the
>    domain has viridian features enabled, but the call to allocate is
>    made before the toolstack has enabled those features. This results
>    in a NULL pointer dereference.
>
> 2) One of the flush hypercall variants is a rep op, but the code
>    does not update the output data with the reps completed. Hence the
>    guest will spin, repeatedly making the hypercall, because it
>    believes it has uncompleted reps.
>
> This patch fixes both of these issues and also adds a check to make
> sure the current vCPU is not included in the flush mask (since there's
> clearly no need for the CPU to IPI itself).

Thinking more about this, the ASID flush does properly take care of the
TLB flushing. Why do we then subsequently use flush_tlb_mask(), as
opposed to a less heavyweight alternative like
smp_send_event_check_mask()?

~Andrew
> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 16 March 2016 13:20
> To: Paul Durrant; xen-devel@lists.xenproject.org
> Cc: Keir (Xen.org); Jan Beulich
> Subject: Re: [PATCH] x86/hvm/viridian: fix the TLB flush hypercall
>
> On 16/03/16 13:00, Paul Durrant wrote:
> > Commit b38d426a "flush remote tlbs by hypercall" added support to allow
> > Windows to request a flush of remote TLBs via hypercall rather than by
> > IPI. Unfortunately it seems that this code was broken in a couple of
> > ways:
> >
> > 1) The allocation of the per-vcpu flush mask is gated on whether the
> >    domain has viridian features enabled, but the call to allocate is
> >    made before the toolstack has enabled those features. This results
> >    in a NULL pointer dereference.
> >
> > 2) One of the flush hypercall variants is a rep op, but the code
> >    does not update the output data with the reps completed. Hence the
> >    guest will spin, repeatedly making the hypercall, because it
> >    believes it has uncompleted reps.
> >
> > This patch fixes both of these issues and also adds a check to make
> > sure the current vCPU is not included in the flush mask (since there's
> > clearly no need for the CPU to IPI itself).
>
> Thinking more about this, the ASID flush does properly take care of the
> TLB flushing. Why do we then subsequently use flush_tlb_mask(), as
> opposed to a less heavyweight alternative like
> smp_send_event_check_mask()?

Yes, all I need is to force the CPUs out of non-root mode, so that will
serve the purpose. V2 coming up.

  Paul

> ~Andrew
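For reference, the change Paul agrees to make in V2 would presumably be along these lines — a minimal sketch only, assuming the pcpu_mask computation stays as in the patch below (the actual V2 is not shown in this thread):

    /*
     * hvm_asid_flush_vcpu() has already been called for each target vCPU
     * above, so stale TLB entries are discarded the next time those vCPUs
     * enter non-root mode. All that is left is to kick the pCPUs currently
     * running target vCPUs out of the guest, for which a plain IPI is
     * enough; a full flush_tlb_mask() does more work than necessary.
     */
    if ( !cpumask_empty(pcpu_mask) )
        smp_send_event_check_mask(pcpu_mask);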
>>> On 16.03.16 at 14:00, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2576,12 +2576,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
>      if ( rc != 0 )
>          goto fail6;
>
> -    if ( is_viridian_domain(d) )
> -    {
> -        rc = viridian_vcpu_init(v);
> -        if ( rc != 0 )
> -            goto fail7;
> -    }
> +    rc = viridian_vcpu_init(v);
> +    if ( rc != 0 )
> +        goto fail7;

Well, it's only a CPU mask (i.e. right now at most 512 bytes), but
anyway - why can't this be done conditionally here, as well as when
HVM_PARAM_VIRIDIAN gets set to a non-zero value? I know you are of the
opinion that the Viridian flag could (should) always be set for HVM
guests, but you also know that I disagree (and Don's VMware patches,
should they ever go in, would support me, since iirc VMware and
Viridian are exclusive of one another).

That said, I now wonder anyway why this is a per-vCPU mask instead of a
per-pCPU one: there's no need for every vCPU in the system to have its
own, afaics.

> @@ -658,6 +658,8 @@ int viridian_hypercall(struct cpu_user_regs *regs)
>          if ( !cpumask_empty(pcpu_mask) )
>              flush_tlb_mask(pcpu_mask);
>
> +        output.rep_complete = input.rep_count;

So why would this not be necessary for the other hypercalls? And what
does a repeat count mean here in the first place? Surely there isn't
any expectation to do more than one flush?

Jan
On 16/03/16 13:31, Jan Beulich wrote:
>
> That said, I now wonder anyway why this is a per-vCPU mask instead of a
> per-pCPU one: there's no need for every vCPU in the system to have its
> own, afaics.

If every vCPU makes a viridian hypercall at the same time, Xen would end
up clobbering the same mask while trying to serve the hypercalls.

~Andrew
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 16 March 2016 13:32
> To: Paul Durrant
> Cc: Andrew Cooper; xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH] x86/hvm/viridian: fix the TLB flush hypercall
>
> >>> On 16.03.16 at 14:00, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -2576,12 +2576,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
> >      if ( rc != 0 )
> >          goto fail6;
> >
> > -    if ( is_viridian_domain(d) )
> > -    {
> > -        rc = viridian_vcpu_init(v);
> > -        if ( rc != 0 )
> > -            goto fail7;
> > -    }
> > +    rc = viridian_vcpu_init(v);
> > +    if ( rc != 0 )
> > +        goto fail7;
>
> Well, it's only a CPU mask (i.e. right now at most 512 bytes), but
> anyway - why can't this be done conditionally here, as well as when
> HVM_PARAM_VIRIDIAN gets set to a non-zero value?

That would also work, but as you say, it is only 512 bytes.

> I know you are of the opinion that the Viridian flag could (should)
> always be set for HVM guests, but you also know that I disagree (and
> Don's VMware patches, should they ever go in, would support me, since
> iirc VMware and Viridian are exclusive of one another).
>
> That said, I now wonder anyway why this is a per-vCPU mask instead of a
> per-pCPU one: there's no need for every vCPU in the system to have its
> own, afaics.
>

Given that it only needs to be valid during the hypercall then yes, a
per-pCPU mask would be sufficient.

> > @@ -658,6 +658,8 @@ int viridian_hypercall(struct cpu_user_regs *regs)
> >          if ( !cpumask_empty(pcpu_mask) )
> >              flush_tlb_mask(pcpu_mask);
> >
> > +        output.rep_complete = input.rep_count;
>
> So why would this not be necessary for the other hypercalls?

As far as I can tell from my spec. (which Microsoft have helpfully
removed from MSDN now), HvFlushVirtualAddressList is the only hypercall
of the three we now support that is a rep op.

> And what does a repeat count mean here in the first place? Surely
> there isn't any expectation to do more than one flush?

HvFlushVirtualAddressList specifies a list of individual VA ranges to
flush. We can't deal with that level of granularity, so yes, we only do
a single flush.

  Paul

> Jan
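For context, the rep-count plumbing Jan is asking about lives in the hypercall input and output values defined by the Hyper-V specification. The sketch below shows the approximate layout (field names and bit widths are reconstructed from memory of the spec, so treat them as an approximation rather than Xen's exact definitions); it illustrates why a rep hypercall that never reports its reps completed makes the guest retry forever:

#include <stdint.h>

/* Approximate layout of the Viridian hypercall input/output values. */
union hypercall_input {
    uint64_t raw;
    struct {
        uint16_t call_code;
        uint16_t fast:1;
        uint16_t rsvd1:15;
        uint16_t rep_count:12;  /* number of list elements to process */
        uint16_t rsvd2:4;
        uint16_t rep_start:12;  /* element index to resume from */
        uint16_t rsvd3:4;
    };
};

union hypercall_output {
    uint64_t raw;
    struct {
        uint16_t result;
        uint16_t rsvd1;
        uint32_t rep_complete:12; /* elements actually processed */
        uint32_t rsvd2:20;
    };
};

/*
 * HvFlushVirtualAddressList passes a list of rep_count VA ranges, but Xen
 * flushes everything in one go, so it can report every element as done:
 *
 *     output.rep_complete = input.rep_count;
 *
 * Leaving rep_complete at zero tells the guest that no progress was made,
 * so it keeps re-issuing the hypercall.
 */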
> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: 16 March 2016 13:37
> To: Jan Beulich; Paul Durrant
> Cc: xen-devel@lists.xenproject.org; Keir (Xen.org)
> Subject: Re: [PATCH] x86/hvm/viridian: fix the TLB flush hypercall
>
> On 16/03/16 13:31, Jan Beulich wrote:
> >
> > That said, I now wonder anyway why this is a per-vCPU mask instead of a
> > per-pCPU one: there's no need for every vCPU in the system to have its
> > own, afaics.
>
> If every vCPU makes a viridian hypercall at the same time, Xen would end
> up clobbering the same mask while trying to serve the hypercalls.

Only if the hypercalls could be pre-empted during execution, which I
don't think is the case, is it?

  Paul

> ~Andrew
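To illustrate the per-pCPU alternative Jan suggests, a sketch along the following lines would avoid the per-vCPU allocation entirely. The names here are hypothetical rather than taken from any real follow-up patch, and the approach is only safe if, as Paul argues, the hypercall cannot be pre-empted while the mask is in use:

/* One scratch mask per physical CPU instead of one per vCPU. */
static DEFINE_PER_CPU(cpumask_t, viridian_flush_mask);

/*
 * ... inside viridian_hypercall(), handling HvFlushVirtualAddress{Space,List},
 * where vcpu_mask is the guest-supplied bitmap of target vCPUs ...
 */
{
    struct vcpu *curr = current, *v;
    cpumask_t *pcpu_mask = &this_cpu(viridian_flush_mask);

    cpumask_clear(pcpu_mask);

    for_each_vcpu ( curr->domain, v )
    {
        if ( v->vcpu_id >= 64 || !(vcpu_mask & (1ULL << v->vcpu_id)) )
            continue;

        hvm_asid_flush_vcpu(v);
        if ( v != curr && v->is_running )
            __cpumask_set_cpu(v->processor, pcpu_mask);
    }

    if ( !cpumask_empty(pcpu_mask) )
        flush_tlb_mask(pcpu_mask);
}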
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5bc2812..f5c55e1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2576,12 +2576,9 @@ int hvm_vcpu_initialise(struct vcpu *v)
     if ( rc != 0 )
         goto fail6;
 
-    if ( is_viridian_domain(d) )
-    {
-        rc = viridian_vcpu_init(v);
-        if ( rc != 0 )
-            goto fail7;
-    }
+    rc = viridian_vcpu_init(v);
+    if ( rc != 0 )
+        goto fail7;
 
     if ( v->vcpu_id == 0 )
     {
@@ -2615,8 +2612,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
 void hvm_vcpu_destroy(struct vcpu *v)
 {
-    if ( is_viridian_domain(v->domain) )
-        viridian_vcpu_deinit(v);
+    viridian_vcpu_deinit(v);
 
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
diff --git a/xen/arch/x86/hvm/viridian.c b/xen/arch/x86/hvm/viridian.c
index 6bd844b..6530a67 100644
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -645,7 +645,7 @@ int viridian_hypercall(struct cpu_user_regs *regs)
             continue;
 
         hvm_asid_flush_vcpu(v);
-        if ( v->is_running )
+        if ( v != curr && v->is_running )
             __cpumask_set_cpu(v->processor, pcpu_mask);
     }
 
@@ -658,6 +658,8 @@ int viridian_hypercall(struct cpu_user_regs *regs)
         if ( !cpumask_empty(pcpu_mask) )
             flush_tlb_mask(pcpu_mask);
 
+        output.rep_complete = input.rep_count;
+
         status = HV_STATUS_SUCCESS;
         break;
     }
Commit b38d426a "flush remote tlbs by hypercall" added support to allow
Windows to request a flush of remote TLBs via hypercall rather than by
IPI. Unfortunately it seems that this code was broken in a couple of
ways:

1) The allocation of the per-vcpu flush mask is gated on whether the
   domain has viridian features enabled, but the call to allocate is
   made before the toolstack has enabled those features. This results
   in a NULL pointer dereference.

2) One of the flush hypercall variants is a rep op, but the code
   does not update the output data with the reps completed. Hence the
   guest will spin, repeatedly making the hypercall, because it
   believes it has uncompleted reps.

This patch fixes both of these issues and also adds a check to make
sure the current vCPU is not included in the flush mask (since there's
clearly no need for the CPU to IPI itself).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c      | 12 ++++--------
 xen/arch/x86/hvm/viridian.c |  4 +++-
 2 files changed, 7 insertions(+), 9 deletions(-)