diff mbox

[v9,12/28] ARM: vGIC: advertise LPI support

Message ID 20170511175340.8448-13-andre.przywara@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andre Przywara May 11, 2017, 5:53 p.m. UTC
To let a guest know about the availability of virtual LPIs, set the
respective bits in the virtual GIC registers and let a guest control
the LPI enable bit.
Only report the LPI capability if the host has initialized at least
one ITS.
This removes a "TBD" comment, as we now populate the processor number
in the GICR_TYPE register.
Advertise 24 bits worth of LPIs to the guest.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 65 insertions(+), 5 deletions(-)

Comments

Julien Grall May 16, 2017, 1:03 p.m. UTC | #1
Hi Andre,

On 11/05/17 18:53, Andre Przywara wrote:
> To let a guest know about the availability of virtual LPIs, set the
> respective bits in the virtual GIC registers and let a guest control
> the LPI enable bit.
> Only report the LPI capability if the host has initialized at least
> one ITS.
> This removes a "TBD" comment, as we now populate the processor number
> in the GICR_TYPE register.

s/GICR_TYPE/GICR_TYPER/

Also, I think it would be worth explaining that you populate 
GICR_TYPER.Processor_Number because the ITS will use it later on.

> Advertise 24 bits worth of LPIs to the guest.

Again this is not valid anymore. You said you would drop it on the 
previous version, so why has it not been done?

>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 65 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 38c123c..6dbdb2e 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -170,8 +170,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* We have not implemented LPI's, read zero */
> -        goto read_as_zero_32;
> +    {
> +        unsigned long flags;
> +
> +        if ( !v->domain->arch.vgic.has_its )
> +            goto read_as_zero_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +
> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> +        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
> +                                info);
> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> +        return 1;
> +    }
>
>      case VREG32(GICR_IIDR):
>          if ( dabt.size != DABT_WORD ) goto bad_width;
> @@ -183,16 +194,20 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          uint64_t typer, aff;
>
>          if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -        /* TBD: Update processor id in [23:8] when ITS support is added */
>          aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
>          typer = aff;
> +        /* We use the VCPU ID as the redistributor ID in bits[23:8] */
> +        typer |= (v->vcpu_id & 0xffff) << 8;

Why the mask here? This sounds like a bug to me if vcpu_id does not fit 
in it, and you would make it worse with the mask.

But this is already addressed by max_vcpus in the vgic_ops. So please 
drop the pointless mask.

Lastly, I would have expected to try to address my remark everywhere 
regarding hardcoding offset. In this case,

>
>          if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
>              typer |= GICR_TYPER_LAST;
>
> +        if ( v->domain->arch.vgic.has_its )
> +            typer |= GICR_TYPER_PLPIS;
> +
>          *r = vgic_reg64_extract(typer, info);
>
>          return 1;
> @@ -426,6 +441,28 @@ static uint64_t sanitize_pendbaser(uint64_t reg)
>      return reg;
>  }
>
> +static void vgic_vcpu_enable_lpis(struct vcpu *v)
> +{
> +    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
> +    unsigned int nr_lpis = BIT((reg & 0x1f) + 1);
> +
> +    /* rdists_enabled is protected by the domain lock. */
> +    ASSERT(spin_is_locked(&v->domain->arch.vgic.lock));
> +
> +    if ( nr_lpis < LPI_OFFSET )
> +        nr_lpis = 0;
> +    else
> +        nr_lpis -= LPI_OFFSET;
> +
> +    if ( !v->domain->arch.vgic.rdists_enabled )
> +    {
> +        v->domain->arch.vgic.nr_lpis = nr_lpis;
> +        v->domain->arch.vgic.rdists_enabled = true;
> +    }
> +
> +    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +    {
> +        unsigned long flags;
> +
> +        if ( !v->domain->arch.vgic.has_its )
> +            goto write_ignore_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +
> +        vgic_lock(v);                   /* protects rdists_enabled */

Getting back to the locking. I don't see any place where we get the 
domain vgic lock before vCPU vgic lock. So this raises the question why 
this ordering and not moving this lock into vgic_vcpu_enable_lpis.

At least this requires documentation in the code and an explanation in 
the commit message.

> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> +
> +        /* LPIs can only be enabled once, but never disabled again. */
> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> +            vgic_vcpu_enable_lpis(v);
> +
> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> +        vgic_unlock(v);
> +
> +        return 1;
> +    }
>
>      case VREG32(GICR_IIDR):
>          /* RO */
> @@ -1058,6 +1113,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
>          typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
>                   DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
>
> +        if ( v->domain->arch.vgic.has_its )
> +        {
> +            typer |= GICD_TYPE_LPIS;
> +            irq_bits = v->domain->arch.vgic.intid_bits;
> +        }

As I said on the previous version, I would have expected the field 
intid_bits to be used even if ITS is not enabled.

The current code makes it very difficult to understand the purpose of 
intid_bits and to know that it is only used when ITS is enabled.

intid_bits should be correctly initialized in vgic_v3_domain_init and 
used directly.

>          typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
>
>          *r = vgic_reg32_extract(typer, info);
>

Cheers,
Stefano Stabellini May 22, 2017, 10:19 p.m. UTC | #2
On Tue, 16 May 2017, Julien Grall wrote:
> > @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu
> > *v, mmio_info_t *info,
> >      switch ( gicr_reg )
> >      {
> >      case VREG32(GICR_CTLR):
> > -        /* LPI's not implemented */
> > -        goto write_ignore_32;
> > +    {
> > +        unsigned long flags;
> > +
> > +        if ( !v->domain->arch.vgic.has_its )
> > +            goto write_ignore_32;
> > +        if ( dabt.size != DABT_WORD ) goto bad_width;
> > +
> > +        vgic_lock(v);                   /* protects rdists_enabled */
> 
> Getting back to the locking. I don't see any place where we get the domain
> vgic lock before vCPU vgic lock. So this raises the question why this ordering
> and not moving this lock into vgic_vcpu_enable_lpis.
> 
> At least this require documentation in the code and explanation in the commit
> message.

It doesn't look like we need to take the v->arch.vgic.lock here. What is
it protecting?


> > +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> > +
> > +        /* LPIs can only be enabled once, but never disabled again. */
> > +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> > +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> > +            vgic_vcpu_enable_lpis(v);
> > +
> > +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> > +        vgic_unlock(v);
> > +
> > +        return 1;
> > +    }
> > 
> >      case VREG32(GICR_IIDR):
> >          /* RO */
Julien Grall May 23, 2017, 10:49 a.m. UTC | #3
Hi Stefano,

On 22/05/17 23:19, Stefano Stabellini wrote:
> On Tue, 16 May 2017, Julien Grall wrote:
>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu
>>> *v, mmio_info_t *info,
>>>      switch ( gicr_reg )
>>>      {
>>>      case VREG32(GICR_CTLR):
>>> -        /* LPI's not implemented */
>>> -        goto write_ignore_32;
>>> +    {
>>> +        unsigned long flags;
>>> +
>>> +        if ( !v->domain->arch.vgic.has_its )
>>> +            goto write_ignore_32;
>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>> +
>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>
>> Getting back to the locking. I don't see any place where we get the domain
>> vgic lock before vCPU vgic lock. So this raises the question why this ordering
>> and not moving this lock into vgic_vcpu_enable_lpis.
>>
>> At least this require documentation in the code and explanation in the commit
>> message.
>
> It doesn't look like we need to take the v->arch.vgic.lock here. What is
> it protecting?

The name of the function is a bit confusing. It does not take the vCPU 
vgic lock but the domain vgic lock.

I believe the vcpu is passed to avoid having v->domain in most of the 
callers. But we should probably rename the function.

In this case it protects vgic_vcpu_enable_lpis because you can configure 
the number of LPIs per re-distributor but this is a domain wide value. I 
know the spec is confusing on this.

Cheers,
Andre Przywara May 23, 2017, 5:23 p.m. UTC | #4
Hi,

On 16/05/17 14:03, Julien Grall wrote:
> Hi Andre,
> 
> On 11/05/17 18:53, Andre Przywara wrote:
>> To let a guest know about the availability of virtual LPIs, set the
>> respective bits in the virtual GIC registers and let a guest control
>> the LPI enable bit.
>> Only report the LPI capability if the host has initialized at least
>> one ITS.
>> This removes a "TBD" comment, as we now populate the processor number
>> in the GICR_TYPE register.
> 
> s/GICR_TYPE/GICR_TYPER/
> 
> Also, I think it would be worth explaining that you populate
> GICR_TYPER.Processor_Number because the ITS will use it later on.
> 
>> Advertise 24 bits worth of LPIs to the guest.
> 
> Again this is not valid anymore. You said you would drop it on the
> previous version. So why it has not been done?
> 
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-v3.c | 70
>> ++++++++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 65 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index 38c123c..6dbdb2e 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -170,8 +170,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct
>> vcpu *v, mmio_info_t *info,
>>      switch ( gicr_reg )
>>      {
>>      case VREG32(GICR_CTLR):
>> -        /* We have not implemented LPI's, read zero */
>> -        goto read_as_zero_32;
>> +    {
>> +        unsigned long flags;
>> +
>> +        if ( !v->domain->arch.vgic.has_its )
>> +            goto read_as_zero_32;
>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>> +
>> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>> +        *r = vgic_reg32_extract(!!(v->arch.vgic.flags &
>> VGIC_V3_LPIS_ENABLED),
>> +                                info);
>> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>> +        return 1;
>> +    }
>>
>>      case VREG32(GICR_IIDR):
>>          if ( dabt.size != DABT_WORD ) goto bad_width;
>> @@ -183,16 +194,20 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct
>> vcpu *v, mmio_info_t *info,
>>          uint64_t typer, aff;
>>
>>          if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> -        /* TBD: Update processor id in [23:8] when ITS support is
>> added */
>>          aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
>>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
>>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
>>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
>>          typer = aff;
>> +        /* We use the VCPU ID as the redistributor ID in bits[23:8] */
>> +        typer |= (v->vcpu_id & 0xffff) << 8;
> 
> Why the mask here? This sound like a bug to me if vcpu_id does not fit
> it and you would make it worst by the mask.
> 
> But this is already addressed by max_vcpus in the vgic_ops. So please
> drop the pointless mask.
> 
> Lastly, I would have expected to try to address my remark everywhere
> regarding hardcoding offset. In this case,

Fixed.

>>
>>          if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
>>              typer |= GICR_TYPER_LAST;
>>
>> +        if ( v->domain->arch.vgic.has_its )
>> +            typer |= GICR_TYPER_PLPIS;
>> +
>>          *r = vgic_reg64_extract(typer, info);
>>
>>          return 1;
>> @@ -426,6 +441,28 @@ static uint64_t sanitize_pendbaser(uint64_t reg)
>>      return reg;
>>  }
>>
>> +static void vgic_vcpu_enable_lpis(struct vcpu *v)
>> +{
>> +    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
>> +    unsigned int nr_lpis = BIT((reg & 0x1f) + 1);
>> +
>> +    /* rdists_enabled is protected by the domain lock. */
>> +    ASSERT(spin_is_locked(&v->domain->arch.vgic.lock));
>> +
>> +    if ( nr_lpis < LPI_OFFSET )
>> +        nr_lpis = 0;
>> +    else
>> +        nr_lpis -= LPI_OFFSET;
>> +
>> +    if ( !v->domain->arch.vgic.rdists_enabled )
>> +    {
>> +        v->domain->arch.vgic.nr_lpis = nr_lpis;
>> +        v->domain->arch.vgic.rdists_enabled = true;
>> +    }
>> +
>> +    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
>> +}
>> +
>>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t
>> *info,
>>                                            uint32_t gicr_reg,
>>                                            register_t r)
>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>> vcpu *v, mmio_info_t *info,
>>      switch ( gicr_reg )
>>      {
>>      case VREG32(GICR_CTLR):
>> -        /* LPI's not implemented */
>> -        goto write_ignore_32;
>> +    {
>> +        unsigned long flags;
>> +
>> +        if ( !v->domain->arch.vgic.has_its )
>> +            goto write_ignore_32;
>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>> +
>> +        vgic_lock(v);                   /* protects rdists_enabled */
> 
> Getting back to the locking. I don't see any place where we get the
> domain vgic lock before vCPU vgic lock.

Because that seems to be the natural locking order, given that one
domain can have multiple VCPUs: We take the domain lock first, then the
VCPU lock. This *seems* to be documented in
xen/include/asm-arm/domain.h, where it says in a comment next to the
domain lock:
================
* If both class of lock is required then this lock must be
* taken first. ....
================

> So this raises the question why
> this ordering and not moving this lock into vgic_vcpu_enable_lpis.

Do you see any issues with that?

> At least this require documentation in the code and explanation in the
> commit message.

In this case I would try to comment on this, but would refrain from
proper locking order documentation (where?) until the rework.

>> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>> +
>> +        /* LPIs can only be enabled once, but never disabled again. */
>> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
>> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
>> +            vgic_vcpu_enable_lpis(v);
>> +
>> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>> +        vgic_unlock(v);
>> +
>> +        return 1;
>> +    }
>>
>>      case VREG32(GICR_IIDR):
>>          /* RO */
>> @@ -1058,6 +1113,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu
>> *v, mmio_info_t *info,
>>          typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
>>                   DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
>>
>> +        if ( v->domain->arch.vgic.has_its )
>> +        {
>> +            typer |= GICD_TYPE_LPIS;
>> +            irq_bits = v->domain->arch.vgic.intid_bits;
>> +        }
> 
> As I said on the previous version, I would have expected the field
> intid_bits to be used even if ITS is not enabled.
> 
> The current code make very difficult to understand the purpose of
> intid_bits and know it is only used when ITS is enabled.
> 
> intid_bits should correctly be initialized in vgic_v3_domain_init and
> directly used it.

OK, I changed this, removed the wrong irq_bits assignment above (this
number is independent of the number of actually implemented SPIs) and
now always put through the hardware value for Dom0 and "10" for DomUs
(to cover all SPIs; we don't need more right now for guests). And yes, I
added a TODO there as well ;-)

Cheers,
Andre.

> 
>>          typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
>>
>>          *r = vgic_reg32_extract(typer, info);
>>
> 
> Cheers,
>
Stefano Stabellini May 23, 2017, 5:47 p.m. UTC | #5
On Tue, 23 May 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 22/05/17 23:19, Stefano Stabellini wrote:
> > On Tue, 16 May 2017, Julien Grall wrote:
> > > > @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
> > > > vcpu
> > > > *v, mmio_info_t *info,
> > > >      switch ( gicr_reg )
> > > >      {
> > > >      case VREG32(GICR_CTLR):
> > > > -        /* LPI's not implemented */
> > > > -        goto write_ignore_32;
> > > > +    {
> > > > +        unsigned long flags;
> > > > +
> > > > +        if ( !v->domain->arch.vgic.has_its )
> > > > +            goto write_ignore_32;
> > > > +        if ( dabt.size != DABT_WORD ) goto bad_width;
> > > > +
> > > > +        vgic_lock(v);                   /* protects rdists_enabled */
> > > 
> > > Getting back to the locking. I don't see any place where we get the domain
> > > vgic lock before vCPU vgic lock. So this raises the question why this
> > > ordering
> > > and not moving this lock into vgic_vcpu_enable_lpis.
> > > 
> > > At least this require documentation in the code and explanation in the
> > > commit
> > > message.
> > 
> > It doesn't look like we need to take the v->arch.vgic.lock here. What is
> > it protecting?
> 
> The name of the function is a bit confusion. It does not take the vCPU vgic
> lock but the domain vgic lock.
> 
> I believe the vcpu is passed to avoid have v->domain in most of the callers.
> But we should probably rename the function.
> 
> In this case it protects vgic_vcpu_enable_lpis because you can configure the
> number of LPIs per re-distributor but this is a domain wide value. I know the
> spec is confusing on this.

The quoting here is very unhelpful. In Andre's patch:

@@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* LPI's not implemented */
-        goto write_ignore_32;
+    {
+        unsigned long flags;
+
+        if ( !v->domain->arch.vgic.has_its )
+            goto write_ignore_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+
+        vgic_lock(v);                   /* protects rdists_enabled */
+        spin_lock_irqsave(&v->arch.vgic.lock, flags);
+
+        /* LPIs can only be enabled once, but never disabled again. */
+        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
+             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
+            vgic_vcpu_enable_lpis(v);
+
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+        vgic_unlock(v);
+
+        return 1;
+    }

My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
If so, why?
Julien Grall May 24, 2017, 10:10 a.m. UTC | #6
Hi Stefano,

On 05/23/2017 06:47 PM, Stefano Stabellini wrote:
> On Tue, 23 May 2017, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 22/05/17 23:19, Stefano Stabellini wrote:
>>> On Tue, 16 May 2017, Julien Grall wrote:
>>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>>>>> vcpu
>>>>> *v, mmio_info_t *info,
>>>>>      switch ( gicr_reg )
>>>>>      {
>>>>>      case VREG32(GICR_CTLR):
>>>>> -        /* LPI's not implemented */
>>>>> -        goto write_ignore_32;
>>>>> +    {
>>>>> +        unsigned long flags;
>>>>> +
>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>> +            goto write_ignore_32;
>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>> +
>>>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>>>
>>>> Getting back to the locking. I don't see any place where we get the domain
>>>> vgic lock before vCPU vgic lock. So this raises the question why this
>>>> ordering
>>>> and not moving this lock into vgic_vcpu_enable_lpis.
>>>>
>>>> At least this require documentation in the code and explanation in the
>>>> commit
>>>> message.
>>>
>>> It doesn't look like we need to take the v->arch.vgic.lock here. What is
>>> it protecting?
>>
>> The name of the function is a bit confusion. It does not take the vCPU vgic
>> lock but the domain vgic lock.
>>
>> I believe the vcpu is passed to avoid have v->domain in most of the callers.
>> But we should probably rename the function.
>>
>> In this case it protects vgic_vcpu_enable_lpis because you can configure the
>> number of LPIs per re-distributor but this is a domain wide value. I know the
>> spec is confusing on this.
>
> The quoting here is very unhelpful. In Andre's patch:

Oh, though my point about vgic_lock naming stands :).

>
> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +    {
> +        unsigned long flags;
> +
> +        if ( !v->domain->arch.vgic.has_its )
> +            goto write_ignore_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +
> +        vgic_lock(v);                   /* protects rdists_enabled */
> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> +
> +        /* LPIs can only be enabled once, but never disabled again. */
> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> +            vgic_vcpu_enable_lpis(v);
> +
> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> +        vgic_unlock(v);
> +
> +        return 1;
> +    }
>
> My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
> If so, why?

I will let Andre confirm here.

Cheers,
Andre Przywara May 25, 2017, 6:02 p.m. UTC | #7
Hi,

On 23/05/17 18:47, Stefano Stabellini wrote:
> On Tue, 23 May 2017, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 22/05/17 23:19, Stefano Stabellini wrote:
>>> On Tue, 16 May 2017, Julien Grall wrote:
>>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>>>>> vcpu
>>>>> *v, mmio_info_t *info,
>>>>>      switch ( gicr_reg )
>>>>>      {
>>>>>      case VREG32(GICR_CTLR):
>>>>> -        /* LPI's not implemented */
>>>>> -        goto write_ignore_32;
>>>>> +    {
>>>>> +        unsigned long flags;
>>>>> +
>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>> +            goto write_ignore_32;
>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>> +
>>>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>>>
>>>> Getting back to the locking. I don't see any place where we get the domain
>>>> vgic lock before vCPU vgic lock. So this raises the question why this
>>>> ordering
>>>> and not moving this lock into vgic_vcpu_enable_lpis.
>>>>
>>>> At least this require documentation in the code and explanation in the
>>>> commit
>>>> message.
>>>
>>> It doesn't look like we need to take the v->arch.vgic.lock here. What is
>>> it protecting?
>>
>> The name of the function is a bit confusion. It does not take the vCPU vgic
>> lock but the domain vgic lock.
>>
>> I believe the vcpu is passed to avoid have v->domain in most of the callers.
>> But we should probably rename the function.
>>
>> In this case it protects vgic_vcpu_enable_lpis because you can configure the
>> number of LPIs per re-distributor but this is a domain wide value. I know the
>> spec is confusing on this.
> 
> The quoting here is very unhelpful. In Andre's patch:
> 
> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +    {
> +        unsigned long flags;
> +
> +        if ( !v->domain->arch.vgic.has_its )
> +            goto write_ignore_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +
> +        vgic_lock(v);                   /* protects rdists_enabled */
> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> +
> +        /* LPIs can only be enabled once, but never disabled again. */
> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> +            vgic_vcpu_enable_lpis(v);
> +
> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> +        vgic_unlock(v);
> +
> +        return 1;
> +    }
> 
> My question is: do we need to take both vgic_lock and v->arch.vgic.lock?

The domain lock (taken by vgic_lock()) protects rdists_enabled. This
variable stores whether at least one redistributor has LPIs enabled. In
this case the property table gets into use and since the table is shared
across all redistributors, we must not change it anymore, even on
another redistributor which has its LPIs still disabled.
So while this looks like this is a per-redistributor (=per-VCPU)
property, it is actually per domain, hence this lock.
The VGIC VCPU lock is then used to naturally protect the enable bit
against multiple VCPUs accessing this register simultaneously - the
redists are MMIO mapped, but not banked, so this is possible.

Does that make sense?

Cheers,
Andre
Stefano Stabellini May 25, 2017, 6:49 p.m. UTC | #8
On Thu, 25 May 2017, Andre Przywara wrote:
> Hi,
> 
> On 23/05/17 18:47, Stefano Stabellini wrote:
> > On Tue, 23 May 2017, Julien Grall wrote:
> >> Hi Stefano,
> >>
> >> On 22/05/17 23:19, Stefano Stabellini wrote:
> >>> On Tue, 16 May 2017, Julien Grall wrote:
> >>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
> >>>>> vcpu
> >>>>> *v, mmio_info_t *info,
> >>>>>      switch ( gicr_reg )
> >>>>>      {
> >>>>>      case VREG32(GICR_CTLR):
> >>>>> -        /* LPI's not implemented */
> >>>>> -        goto write_ignore_32;
> >>>>> +    {
> >>>>> +        unsigned long flags;
> >>>>> +
> >>>>> +        if ( !v->domain->arch.vgic.has_its )
> >>>>> +            goto write_ignore_32;
> >>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> >>>>> +
> >>>>> +        vgic_lock(v);                   /* protects rdists_enabled */
> >>>>
> >>>> Getting back to the locking. I don't see any place where we get the domain
> >>>> vgic lock before vCPU vgic lock. So this raises the question why this
> >>>> ordering
> >>>> and not moving this lock into vgic_vcpu_enable_lpis.
> >>>>
> >>>> At least this require documentation in the code and explanation in the
> >>>> commit
> >>>> message.
> >>>
> >>> It doesn't look like we need to take the v->arch.vgic.lock here. What is
> >>> it protecting?
> >>
> >> The name of the function is a bit confusion. It does not take the vCPU vgic
> >> lock but the domain vgic lock.
> >>
> >> I believe the vcpu is passed to avoid have v->domain in most of the callers.
> >> But we should probably rename the function.
> >>
> >> In this case it protects vgic_vcpu_enable_lpis because you can configure the
> >> number of LPIs per re-distributor but this is a domain wide value. I know the
> >> spec is confusing on this.
> > 
> > The quoting here is very unhelpful. In Andre's patch:
> > 
> > @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
> >      switch ( gicr_reg )
> >      {
> >      case VREG32(GICR_CTLR):
> > -        /* LPI's not implemented */
> > -        goto write_ignore_32;
> > +    {
> > +        unsigned long flags;
> > +
> > +        if ( !v->domain->arch.vgic.has_its )
> > +            goto write_ignore_32;
> > +        if ( dabt.size != DABT_WORD ) goto bad_width;
> > +
> > +        vgic_lock(v);                   /* protects rdists_enabled */
> > +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> > +
> > +        /* LPIs can only be enabled once, but never disabled again. */
> > +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> > +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> > +            vgic_vcpu_enable_lpis(v);
> > +
> > +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> > +        vgic_unlock(v);
> > +
> > +        return 1;
> > +    }
> > 
> > My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
> 
> The domain lock (taken by vgic_lock()) protects rdists_enabled. This
> variable stores whether at least one redistributor has LPIs enabled. In
> this case the property table gets into use and since the table is shared
> across all redistributors, we must not change it anymore, even on
> another redistributor which has its LPIs still disabled.
> So while this looks like this is a per-redistributor (=per-VCPU)
> property, it is actually per domain, hence this lock.
> The VGIC VCPU lock is then used to naturally protect the enable bit
> against multiple VCPUs accessing this register simultaneously - the
> redists are MMIO mapped, but not banked, so this is possible.
> 
> Does that make sense?

If the VGIC VCPU lock is only used to protect VGIC_V3_LPIS_ENABLED,
couldn't we just read/write the bit atomically? It's just a bit after
all, it doesn't need a lock.
Julien Grall May 25, 2017, 8:07 p.m. UTC | #9
Hi Stefano,

On 25/05/2017 19:49, Stefano Stabellini wrote:
> On Thu, 25 May 2017, Andre Przywara wrote:
>> Hi,
>>
>> On 23/05/17 18:47, Stefano Stabellini wrote:
>>> On Tue, 23 May 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 22/05/17 23:19, Stefano Stabellini wrote:
>>>>> On Tue, 16 May 2017, Julien Grall wrote:
>>>>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>>>>>>> vcpu
>>>>>>> *v, mmio_info_t *info,
>>>>>>>      switch ( gicr_reg )
>>>>>>>      {
>>>>>>>      case VREG32(GICR_CTLR):
>>>>>>> -        /* LPI's not implemented */
>>>>>>> -        goto write_ignore_32;
>>>>>>> +    {
>>>>>>> +        unsigned long flags;
>>>>>>> +
>>>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>>>> +            goto write_ignore_32;
>>>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>>>> +
>>>>>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>>>>>
>>>>>> Getting back to the locking. I don't see any place where we get the domain
>>>>>> vgic lock before vCPU vgic lock. So this raises the question why this
>>>>>> ordering
>>>>>> and not moving this lock into vgic_vcpu_enable_lpis.
>>>>>>
>>>>>> At least this require documentation in the code and explanation in the
>>>>>> commit
>>>>>> message.
>>>>>
>>>>> It doesn't look like we need to take the v->arch.vgic.lock here. What is
>>>>> it protecting?
>>>>
>>>> The name of the function is a bit confusion. It does not take the vCPU vgic
>>>> lock but the domain vgic lock.
>>>>
>>>> I believe the vcpu is passed to avoid having v->domain in most of the callers.
>>>> But we should probably rename the function.
>>>>
>>>> In this case it protects vgic_vcpu_enable_lpis because you can configure the
>>>> number of LPIs per re-distributor but this is a domain wide value. I know the
>>>> spec is confusing on this.
>>>
>>> The quoting here is very unhelpful. In Andre's patch:
>>>
>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>>>      switch ( gicr_reg )
>>>      {
>>>      case VREG32(GICR_CTLR):
>>> -        /* LPI's not implemented */
>>> -        goto write_ignore_32;
>>> +    {
>>> +        unsigned long flags;
>>> +
>>> +        if ( !v->domain->arch.vgic.has_its )
>>> +            goto write_ignore_32;
>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>> +
>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>> +
>>> +        /* LPIs can only be enabled once, but never disabled again. */
>>> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
>>> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
>>> +            vgic_vcpu_enable_lpis(v);
>>> +
>>> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>>> +        vgic_unlock(v);
>>> +
>>> +        return 1;
>>> +    }
>>>
>>> My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
>>
>> The domain lock (taken by vgic_lock()) protects rdists_enabled. This
>> variable stores whether at least one redistributor has LPIs enabled. In
>> this case the property table gets into use and since the table is shared
>> across all redistributors, we must not change it anymore, even on
>> another redistributor which has its LPIs still disabled.
>> So while this looks like this is a per-redistributor (=per-VCPU)
>> property, it is actually per domain, hence this lock.
>> The VGIC VCPU lock is then used to naturally protect the enable bit
>> against multiple VCPUs accessing this register simultaneously - the
>> redists are MMIO mapped, but not banked, so this is possible.
>>
>> Does that make sense?
>
> If the VGIC VCPU lock is only used to protect VGIC_V3_LPIS_ENABLED,
> couldn't we just read/write the bit atomically? It's just a bit after
> all, it doesn't need a lock.

The vGIC vCPU lock is also here to serialize access to the 
re-distributor state when necessary.

For instance you don't want to allow write in PENDBASER after LPIs have 
been enabled.

If you don't take the lock here, you would have a small race where 
PENDBASER might be written whilst the LPIs are getting enabled.

The code in PENDBASER today does not strictly require the locking, but I 
think we should keep the lock around. Moving to the atomic will not 
really benefit here as writes to those registers will be very unlikely so 
we don't need very good performance.

Cheers,
Stefano Stabellini May 25, 2017, 9:05 p.m. UTC | #10
On Thu, 25 May 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 25/05/2017 19:49, Stefano Stabellini wrote:
> > On Thu, 25 May 2017, Andre Przywara wrote:
> > > Hi,
> > > 
> > > On 23/05/17 18:47, Stefano Stabellini wrote:
> > > > On Tue, 23 May 2017, Julien Grall wrote:
> > > > > Hi Stefano,
> > > > > 
> > > > > On 22/05/17 23:19, Stefano Stabellini wrote:
> > > > > > On Tue, 16 May 2017, Julien Grall wrote:
> > > > > > > > @@ -436,8 +473,26 @@ static int
> > > > > > > > __vgic_v3_rdistr_rd_mmio_write(struct
> > > > > > > > vcpu
> > > > > > > > *v, mmio_info_t *info,
> > > > > > > >      switch ( gicr_reg )
> > > > > > > >      {
> > > > > > > >      case VREG32(GICR_CTLR):
> > > > > > > > -        /* LPI's not implemented */
> > > > > > > > -        goto write_ignore_32;
> > > > > > > > +    {
> > > > > > > > +        unsigned long flags;
> > > > > > > > +
> > > > > > > > +        if ( !v->domain->arch.vgic.has_its )
> > > > > > > > +            goto write_ignore_32;
> > > > > > > > +        if ( dabt.size != DABT_WORD ) goto bad_width;
> > > > > > > > +
> > > > > > > > +        vgic_lock(v);                   /* protects
> > > > > > > > rdists_enabled */
> > > > > > > 
> > > > > > > Getting back to the locking. I don't see any place where we get
> > > > > > > the domain
> > > > > > > vgic lock before vCPU vgic lock. So this raises the question why
> > > > > > > this
> > > > > > > ordering
> > > > > > > and not moving this lock into vgic_vcpu_enable_lpis.
> > > > > > > 
> > > > > > > At least this require documentation in the code and explanation in
> > > > > > > the
> > > > > > > commit
> > > > > > > message.
> > > > > > 
> > > > > > It doesn't look like we need to take the v->arch.vgic.lock here.
> > > > > > What is
> > > > > > it protecting?
> > > > > 
> > > > > The name of the function is a bit confusing. It does not take the vCPU
> > > > > vgic
> > > > > lock but the domain vgic lock.
> > > > > 
> > > > > I believe the vcpu is passed to avoid having v->domain in most of the
> > > > > callers.
> > > > > But we should probably rename the function.
> > > > > 
> > > > > In this case it protects vgic_vcpu_enable_lpis because you can
> > > > > configure the
> > > > > number of LPIs per re-distributor but this is a domain wide value. I
> > > > > know the
> > > > > spec is confusing on this.
> > > > 
> > > > The quoting here is very unhelpful. In Andre's patch:
> > > > 
> > > > @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
> > > > vcpu *v, mmio_info_t *info,
> > > >      switch ( gicr_reg )
> > > >      {
> > > >      case VREG32(GICR_CTLR):
> > > > -        /* LPI's not implemented */
> > > > -        goto write_ignore_32;
> > > > +    {
> > > > +        unsigned long flags;
> > > > +
> > > > +        if ( !v->domain->arch.vgic.has_its )
> > > > +            goto write_ignore_32;
> > > > +        if ( dabt.size != DABT_WORD ) goto bad_width;
> > > > +
> > > > +        vgic_lock(v);                   /* protects rdists_enabled */
> > > > +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
> > > > +
> > > > +        /* LPIs can only be enabled once, but never disabled again. */
> > > > +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
> > > > +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
> > > > +            vgic_vcpu_enable_lpis(v);
> > > > +
> > > > +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> > > > +        vgic_unlock(v);
> > > > +
> > > > +        return 1;
> > > > +    }
> > > > 
> > > > My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
> > > 
> > > The domain lock (taken by vgic_lock()) protects rdists_enabled. This
> > > variable stores whether at least one redistributor has LPIs enabled. In
> > > this case the property table gets into use and since the table is shared
> > > across all redistributors, we must not change it anymore, even on
> > > another redistributor which has its LPIs still disabled.
> > > So while this looks like this is a per-redistributor (=per-VCPU)
> > > property, it is actually per domain, hence this lock.
> > > The VGIC VCPU lock is then used to naturally protect the enable bit
> > > against multiple VCPUs accessing this register simultaneously - the
> > > redists are MMIO mapped, but not banked, so this is possible.
> > > 
> > > Does that make sense?
> > 
> > If the VGIC VCPU lock is only used to protect VGIC_V3_LPIS_ENABLED,
> > couldn't we just read/write the bit atomically? It's just a bit after
> > all, it doesn't need a lock.
> 
> The vGIC vCPU lock is also here to serialize access to the re-distributor
> state when necessary.
> 
> For instance you don't want to allow write in PENDBASER after LPIs have been
> enabled.
> 
> If you don't take the lock here, you would have a small race where PENDBASER
> might be written whilst the LPIs are getting enabled.
> 
> The code in PENDBASER today does not strictly require the locking, but I think
> we should keep the lock around. Moving to the atomic will not really benefit
> here as writes to those registers will be very unlikely so we don't need very
> good performance.

I suggested the atomic as a way to replace the lock, to reduce the
number of lock order dependencies, rather than for performance (who
cares about performance for this case). If all accesses to
VGIC_V3_LPIS_ENABLED are atomic, then we wouldn't need the lock. 

Another maybe simpler way to keep the vgic vcpu lock but avoid
introducing the vgic domain lock -> vgic vcpu lock dependency (the less
the better) would be to take the vgic vcpu lock first, release it, then
take the vgic domain lock and call vgic_vcpu_enable_lpis after.  In
pseudo-code:

    vgic vcpu lock
    read old value of VGIC_V3_LPIS_ENABLED
    write new value of VGIC_V3_LPIS_ENABLED
    vgic vcpu unlock

    vgic domain lock
    vgic_vcpu_enable_lpis (minus the setting of arch.vgic.flags)
    vgic domain unlock

It doesn't look like we need to set VGIC_V3_LPIS_ENABLED within
vgic_vcpu_enable_lpis, so this seems to be working. What do you think?
Julien Grall May 26, 2017, 10:19 a.m. UTC | #11
Hi Stefano,

On 25/05/17 22:05, Stefano Stabellini wrote:
> On Thu, 25 May 2017, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 25/05/2017 19:49, Stefano Stabellini wrote:
>>> On Thu, 25 May 2017, Andre Przywara wrote:
>>>> Hi,
>>>>
>>>> On 23/05/17 18:47, Stefano Stabellini wrote:
>>>>> On Tue, 23 May 2017, Julien Grall wrote:
>>>>>> Hi Stefano,
>>>>>>
>>>>>> On 22/05/17 23:19, Stefano Stabellini wrote:
>>>>>>> On Tue, 16 May 2017, Julien Grall wrote:
>>>>>>>>> @@ -436,8 +473,26 @@ static int
>>>>>>>>> __vgic_v3_rdistr_rd_mmio_write(struct
>>>>>>>>> vcpu
>>>>>>>>> *v, mmio_info_t *info,
>>>>>>>>>      switch ( gicr_reg )
>>>>>>>>>      {
>>>>>>>>>      case VREG32(GICR_CTLR):
>>>>>>>>> -        /* LPI's not implemented */
>>>>>>>>> -        goto write_ignore_32;
>>>>>>>>> +    {
>>>>>>>>> +        unsigned long flags;
>>>>>>>>> +
>>>>>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>>>>>> +            goto write_ignore_32;
>>>>>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>>>>>> +
>>>>>>>>> +        vgic_lock(v);                   /* protects
>>>>>>>>> rdists_enabled */
>>>>>>>>
>>>>>>>> Getting back to the locking. I don't see any place where we get
>>>>>>>> the domain
>>>>>>>> vgic lock before vCPU vgic lock. So this raises the question why
>>>>>>>> this
>>>>>>>> ordering
>>>>>>>> and not moving this lock into vgic_vcpu_enable_lpis.
>>>>>>>>
>>>>>>>> At least this require documentation in the code and explanation in
>>>>>>>> the
>>>>>>>> commit
>>>>>>>> message.
>>>>>>>
>>>>>>> It doesn't look like we need to take the v->arch.vgic.lock here.
>>>>>>> What is
>>>>>>> it protecting?
>>>>>>
>>>>>> The name of the function is a bit confusing. It does not take the vCPU
>>>>>> vgic
>>>>>> lock but the domain vgic lock.
>>>>>>
>>>>>> I believe the vcpu is passed to avoid having v->domain in most of the
>>>>>> callers.
>>>>>> But we should probably rename the function.
>>>>>>
>>>>>> In this case it protects vgic_vcpu_enable_lpis because you can
>>>>>> configure the
>>>>>> number of LPIs per re-distributor but this is a domain wide value. I
>>>>>> know the
>>>>>> spec is confusing on this.
>>>>>
>>>>> The quoting here is very unhelpful. In Andre's patch:
>>>>>
>>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>>>>> vcpu *v, mmio_info_t *info,
>>>>>      switch ( gicr_reg )
>>>>>      {
>>>>>      case VREG32(GICR_CTLR):
>>>>> -        /* LPI's not implemented */
>>>>> -        goto write_ignore_32;
>>>>> +    {
>>>>> +        unsigned long flags;
>>>>> +
>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>> +            goto write_ignore_32;
>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>> +
>>>>> +        vgic_lock(v);                   /* protects rdists_enabled */
>>>>> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>>>> +
>>>>> +        /* LPIs can only be enabled once, but never disabled again. */
>>>>> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
>>>>> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
>>>>> +            vgic_vcpu_enable_lpis(v);
>>>>> +
>>>>> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>>>>> +        vgic_unlock(v);
>>>>> +
>>>>> +        return 1;
>>>>> +    }
>>>>>
>>>>> My question is: do we need to take both vgic_lock and v->arch.vgic.lock?
>>>>
>>>> The domain lock (taken by vgic_lock()) protects rdists_enabled. This
>>>> variable stores whether at least one redistributor has LPIs enabled. In
>>>> this case the property table gets into use and since the table is shared
>>>> across all redistributors, we must not change it anymore, even on
>>>> another redistributor which has its LPIs still disabled.
>>>> So while this looks like this is a per-redistributor (=per-VCPU)
>>>> property, it is actually per domain, hence this lock.
>>>> The VGIC VCPU lock is then used to naturally protect the enable bit
>>>> against multiple VCPUs accessing this register simultaneously - the
>>>> redists are MMIO mapped, but not banked, so this is possible.
>>>>
>>>> Does that make sense?
>>>
>>> If the VGIC VCPU lock is only used to protect VGIC_V3_LPIS_ENABLED,
>>> couldn't we just read/write the bit atomically? It's just a bit after
>>> all, it doesn't need a lock.
>>
>> The vGIC vCPU lock is also here to serialize access to the re-distributor
>> state when necessary.
>>
>> For instance you don't want to allow write in PENDBASER after LPIs have been
>> enabled.
>>
>> If you don't take the lock here, you would have a small race where PENDBASER
>> might be written whilst the LPIs are getting enabled.
>>
>> The code in PENDBASER today does not strictly require the locking, but I think
>> we should keep the lock around. Moving to the atomic will not really benefit
>> here as writes to those registers will be very unlikely so we don't need very
>> good performance.
>
> I suggested the atomic as a way to replace the lock, to reduce the
> number of lock order dependencies, rather than for performance (who
> cares about performance for this case). If all accesses to
> VGIC_V3_LPIS_ENABLED are atomic, then we wouldn't need the lock.
>
> Another maybe simpler way to keep the vgic vcpu lock but avoid
> introducing the vgic domain lock -> vgic vcpu lock dependency (the less
> the better) would be to take the vgic vcpu lock first, release it, then
> take the vgic domain lock and call vgic_vcpu_enable_lpis after.  In
> pseudo-code:
>
>     vgic vcpu lock
>     read old value of VGIC_V3_LPIS_ENABLED
>     write new value of VGIC_V3_LPIS_ENABLED
>     vgic vcpu unlock
>
>     vgic domain lock
>     vgic_vcpu_enable_lpis (minus the setting of arch.vgic.flags)
>     vgic domain unlock
>
> It doesn't look like we need to set VGIC_V3_LPIS_ENABLED within
> vgic_vcpu_enable_lpis, so this seems to be working. What do you think?

 From a point of view of the vGIC you want to enable 
VGIC_V3_LPIS_ENABLED after all the sanity checks have been done.

I would have expected the ITS to check if the redistributor has been 
enabled before enabling it (see vgic_v3_verify_its_status). This is 
because the ITS is using the priority table and also the number of LPIs.

So you effectively want to have VGIC_V3_LPIS_ENABLED set after in 
vgic_vcpu_enable_lpis to avoid potential race condition. You may also 
want to have a mb() before writing to it so you can use 
VGIC_V3_LPIS_ENABLED without any lock safely.

Andre, can you explain why the ITS does not check whether the 
rdists_enabled are enabled?

Cheers,
Andre Przywara May 26, 2017, 5:12 p.m. UTC | #12
Hi,

On 26/05/17 11:19, Julien Grall wrote:
> Hi Stefano,
> 
> On 25/05/17 22:05, Stefano Stabellini wrote:
>> On Thu, 25 May 2017, Julien Grall wrote:
>>> Hi Stefano,
>>>
>>> On 25/05/2017 19:49, Stefano Stabellini wrote:
>>>> On Thu, 25 May 2017, Andre Przywara wrote:
>>>>> Hi,
>>>>>
>>>>> On 23/05/17 18:47, Stefano Stabellini wrote:
>>>>>> On Tue, 23 May 2017, Julien Grall wrote:
>>>>>>> Hi Stefano,
>>>>>>>
>>>>>>> On 22/05/17 23:19, Stefano Stabellini wrote:
>>>>>>>> On Tue, 16 May 2017, Julien Grall wrote:
>>>>>>>>>> @@ -436,8 +473,26 @@ static int
>>>>>>>>>> __vgic_v3_rdistr_rd_mmio_write(struct
>>>>>>>>>> vcpu
>>>>>>>>>> *v, mmio_info_t *info,
>>>>>>>>>>      switch ( gicr_reg )
>>>>>>>>>>      {
>>>>>>>>>>      case VREG32(GICR_CTLR):
>>>>>>>>>> -        /* LPI's not implemented */
>>>>>>>>>> -        goto write_ignore_32;
>>>>>>>>>> +    {
>>>>>>>>>> +        unsigned long flags;
>>>>>>>>>> +
>>>>>>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>>>>>>> +            goto write_ignore_32;
>>>>>>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>>>>>>> +
>>>>>>>>>> +        vgic_lock(v);                   /* protects
>>>>>>>>>> rdists_enabled */
>>>>>>>>>
>>>>>>>>> Getting back to the locking. I don't see any place where we get
>>>>>>>>> the domain
>>>>>>>>> vgic lock before vCPU vgic lock. So this raises the question why
>>>>>>>>> this
>>>>>>>>> ordering
>>>>>>>>> and not moving this lock into vgic_vcpu_enable_lpis.
>>>>>>>>>
>>>>>>>>> At least this require documentation in the code and explanation in
>>>>>>>>> the
>>>>>>>>> commit
>>>>>>>>> message.
>>>>>>>>
>>>>>>>> It doesn't look like we need to take the v->arch.vgic.lock here.
>>>>>>>> What is
>>>>>>>> it protecting?
>>>>>>>
>>>>>>> The name of the function is a bit confusing. It does not take the
>>>>>>> vCPU
>>>>>>> vgic
>>>>>>> lock but the domain vgic lock.
>>>>>>>
>>>>>>> I believe the vcpu is passed to avoid having v->domain in most of the
>>>>>>> callers.
>>>>>>> But we should probably rename the function.
>>>>>>>
>>>>>>> In this case it protects vgic_vcpu_enable_lpis because you can
>>>>>>> configure the
>>>>>>> number of LPIs per re-distributor but this is a domain wide value. I
>>>>>>> know the
>>>>>>> spec is confusing on this.
>>>>>>
>>>>>> The quoting here is very unhelpful. In Andre's patch:
>>>>>>
>>>>>> @@ -436,8 +473,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>>>>>> vcpu *v, mmio_info_t *info,
>>>>>>      switch ( gicr_reg )
>>>>>>      {
>>>>>>      case VREG32(GICR_CTLR):
>>>>>> -        /* LPI's not implemented */
>>>>>> -        goto write_ignore_32;
>>>>>> +    {
>>>>>> +        unsigned long flags;
>>>>>> +
>>>>>> +        if ( !v->domain->arch.vgic.has_its )
>>>>>> +            goto write_ignore_32;
>>>>>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>>>>>> +
>>>>>> +        vgic_lock(v);                   /* protects
>>>>>> rdists_enabled */
>>>>>> +        spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>>>>> +
>>>>>> +        /* LPIs can only be enabled once, but never disabled
>>>>>> again. */
>>>>>> +        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
>>>>>> +             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
>>>>>> +            vgic_vcpu_enable_lpis(v);
>>>>>> +
>>>>>> +        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
>>>>>> +        vgic_unlock(v);
>>>>>> +
>>>>>> +        return 1;
>>>>>> +    }
>>>>>>
>>>>>> My question is: do we need to take both vgic_lock and
>>>>>> v->arch.vgic.lock?
>>>>>
>>>>> The domain lock (taken by vgic_lock()) protects rdists_enabled. This
>>>>> variable stores whether at least one redistributor has LPIs
>>>>> enabled. In
>>>>> this case the property table gets into use and since the table is
>>>>> shared
>>>>> across all redistributors, we must not change it anymore, even on
>>>>> another redistributor which has its LPIs still disabled.
>>>>> So while this looks like this is a per-redistributor (=per-VCPU)
>>>>> property, it is actually per domain, hence this lock.
>>>>> The VGIC VCPU lock is then used to naturally protect the enable bit
>>>>> against multiple VCPUs accessing this register simultaneously - the
>>>>> redists are MMIO mapped, but not banked, so this is possible.
>>>>>
>>>>> Does that make sense?
>>>>
>>>> If the VGIC VCPU lock is only used to protect VGIC_V3_LPIS_ENABLED,
>>>> couldn't we just read/write the bit atomically? It's just a bit after
>>>> all, it doesn't need a lock.
>>>
>>> The vGIC vCPU lock is also here to serialize access to the
>>> re-distributor
>>> state when necessary.
>>>
>>> For instance you don't want to allow write in PENDBASER after LPIs
>>> have been
>>> enabled.
>>>
>>> If you don't take the lock here, you would have a small race where
>>> PENDBASER
>>> might be written whilst the LPIs are getting enabled.
>>>
>>> The code in PENDBASER today does not strictly require the locking,
>>> but I think
>>> we should keep the lock around. Moving to the atomic will not really
>>> benefit
>>> here as write to those registers will be very unlikely so we don't
>>> need very
>>> good performance.
>>
>> I suggested the atomic as a way to replace the lock, to reduce the
>> number of lock order dependencies, rather than for performance (who
>> cares about performance for this case). If all accesses to
>> VGIC_V3_LPIS_ENABLED are atomic, then we wouldn't need the lock.
>>
>> Another maybe simpler way to keep the vgic vcpu lock but avoid
>> introducing the vgic domain lock -> vgic vcpu lock dependency (the less
>> the better) would be to take the vgic vcpu lock first, release it, then
>> take the vgic domain lock and call vgic_vcpu_enable_lpis after.  In
>> pseudo-code:
>>
>>     vgic vcpu lock
>>     read old value of VGIC_V3_LPIS_ENABLED
>>     write new value of VGIC_V3_LPIS_ENABLED
>>     vgic vcpu unlock
>>
>>     vgic domain lock
>>     vgic_vcpu_enable_lpis (minus the setting of arch.vgic.flags)
>>     vgic domain unlock
>>
>> It doesn't look like we need to set VGIC_V3_LPIS_ENABLED within
>> vgic_vcpu_enable_lpis, so this seems to be working. What do you think?
> 
> From a point of view of the vGIC you want to enable VGIC_V3_LPIS_ENABLED
> after all the sanity checks have been done.
> 
> I would have expected the ITS to check if the redistributor has been
> enabled before enabling it (see vgic_v3_verify_its_status). This is
> because the ITS is using the priority table and also the number of LPIs.
> 
> So you effectively want to have VGIC_V3_LPIS_ENABLED set after in
> vgic_vcpu_enable_lpis to avoid potential race condition. You may also
> want to have a mb() before writing to it so you can using
> VGIC_V3_LPIS_ENABLED without any lock safely.

Right, I added an smp_mb() after the rdists_enabled write to make sure
this is in sync.

> Andre, can you explain why the ITS does not check whether the
> rdists_enabled are enabled?

So architecturally it's not required to have LPIs enabled, and from a
spec point of view the ITS does not care about the LPI's properties.
We check for LPIs being enabled on that redistributor before injecting LPI.

But I think you are right that our implementation is a bit sloppy with
the separation between LPIs and the ITS, and reads the property table
while handling commands - because we only keep track of mapped LPIs.
So I added a check now in update_lpi_properties() to bail out (without
an error) if no redistributor has LPIs enabled yet. That should solve
that corner case.

Cheers,
Andre.
diff mbox

Patch

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 38c123c..6dbdb2e 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -170,8 +170,19 @@  static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* We have not implemented LPI's, read zero */
-        goto read_as_zero_32;
+    {
+        unsigned long flags;
+
+        if ( !v->domain->arch.vgic.has_its )
+            goto read_as_zero_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+
+        spin_lock_irqsave(&v->arch.vgic.lock, flags);
+        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
+                                info);
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+        return 1;
+    }
 
     case VREG32(GICR_IIDR):
         if ( dabt.size != DABT_WORD ) goto bad_width;
@@ -183,16 +194,20 @@  static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         uint64_t typer, aff;
 
         if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-        /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
         typer = aff;
+        /* We use the VCPU ID as the redistributor ID in bits[23:8] */
+        typer |= (v->vcpu_id & 0xffff) << 8;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
             typer |= GICR_TYPER_LAST;
 
+        if ( v->domain->arch.vgic.has_its )
+            typer |= GICR_TYPER_PLPIS;
+
         *r = vgic_reg64_extract(typer, info);
 
         return 1;
@@ -426,6 +441,28 @@  static uint64_t sanitize_pendbaser(uint64_t reg)
     return reg;
 }
 
+static void vgic_vcpu_enable_lpis(struct vcpu *v)
+{
+    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
+    unsigned int nr_lpis = BIT((reg & 0x1f) + 1);
+
+    /* rdists_enabled is protected by the domain lock. */
+    ASSERT(spin_is_locked(&v->domain->arch.vgic.lock));
+
+    if ( nr_lpis < LPI_OFFSET )
+        nr_lpis = 0;
+    else
+        nr_lpis -= LPI_OFFSET;
+
+    if ( !v->domain->arch.vgic.rdists_enabled )
+    {
+        v->domain->arch.vgic.nr_lpis = nr_lpis;
+        v->domain->arch.vgic.rdists_enabled = true;
+    }
+
+    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
@@ -436,8 +473,26 @@  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* LPI's not implemented */
-        goto write_ignore_32;
+    {
+        unsigned long flags;
+
+        if ( !v->domain->arch.vgic.has_its )
+            goto write_ignore_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+
+        vgic_lock(v);                   /* protects rdists_enabled */
+        spin_lock_irqsave(&v->arch.vgic.lock, flags);
+
+        /* LPIs can only be enabled once, but never disabled again. */
+        if ( (r & GICR_CTLR_ENABLE_LPIS) &&
+             !(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
+            vgic_vcpu_enable_lpis(v);
+
+        spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+        vgic_unlock(v);
+
+        return 1;
+    }
 
     case VREG32(GICR_IIDR):
         /* RO */
@@ -1058,6 +1113,11 @@  static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
                  DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
 
+        if ( v->domain->arch.vgic.has_its )
+        {
+            typer |= GICD_TYPE_LPIS;
+            irq_bits = v->domain->arch.vgic.intid_bits;
+        }
         typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
 
         *r = vgic_reg32_extract(typer, info);