
[v5,13/30] ARM: GICv3: forward pending LPIs to guests

Message ID 1491434362-30310-14-git-send-email-andre.przywara@arm.com (mailing list archive)
State New, archived

Commit Message

Andre Przywara April 5, 2017, 11:19 p.m. UTC
Upon receiving an LPI on the host, we need to find the right VCPU and
virtual IRQ number to get this IRQ injected.
Iterate our two-level LPI table to find this information quickly when
the host takes an LPI. Call the existing injection function to let the
GIC emulation deal with this interrupt.
We also enhance struct pending_irq to cache the pending bit and the
priority information for LPIs, as we can't afford to walk the tables in
guest memory every time we handle an incoming LPI.
This introduces do_LPI() as a new hardware gic_ops hook and a function to
retrieve the (cached) priority value of an LPI as a new vgic_ops hook.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v2.c            |  7 +++++
 xen/arch/arm/gic-v3-lpi.c        | 56 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            |  1 +
 xen/arch/arm/gic.c               |  8 +++++-
 xen/arch/arm/vgic-v2.c           |  7 +++++
 xen/arch/arm/vgic-v3.c           | 12 +++++++++
 xen/arch/arm/vgic.c              |  7 ++++-
 xen/include/asm-arm/gic.h        |  2 ++
 xen/include/asm-arm/gic_v3_its.h |  8 ++++++
 xen/include/asm-arm/vgic.h       |  3 +++
 10 files changed, 109 insertions(+), 2 deletions(-)

Comments

Stefano Stabellini April 5, 2017, 11:45 p.m. UTC | #1
On Thu, 6 Apr 2017, Andre Przywara wrote:
> Upon receiving an LPI on the host, we need to find the right VCPU and
> virtual IRQ number to get this IRQ injected.
> Iterate our two-level LPI table to find this information quickly when
> the host takes an LPI. Call the existing injection function to let the
> GIC emulation deal with this interrupt.
> Also we enhance struct pending_irq to cache the pending bit and the
> priority information for LPIs, as we can't afford to walk the tables in
> guest memory every time we handle an incoming LPI.
> This introduces a do_LPI() as a hardware gic_ops and a function to
> retrieve the (cached) priority value of an LPI and a vgic_ops.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v2.c            |  7 +++++
>  xen/arch/arm/gic-v3-lpi.c        | 56 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c            |  1 +
>  xen/arch/arm/gic.c               |  8 +++++-
>  xen/arch/arm/vgic-v2.c           |  7 +++++
>  xen/arch/arm/vgic-v3.c           | 12 +++++++++
>  xen/arch/arm/vgic.c              |  7 ++++-
>  xen/include/asm-arm/gic.h        |  2 ++
>  xen/include/asm-arm/gic_v3_its.h |  8 ++++++
>  xen/include/asm-arm/vgic.h       |  3 +++
>  10 files changed, 109 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index 270a136..f4d7949 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1217,6 +1217,12 @@ static int __init gicv2_init(void)
>      return 0;
>  }
>  
> +void gicv2_do_LPI(unsigned int lpi)
> +{
> +    /* No LPIs in a GICv2 */
> +    BUG();
> +}
> +
>  const static struct gic_hw_operations gicv2_ops = {
>      .info                = &gicv2_info,
>      .init                = gicv2_init,
> @@ -1244,6 +1250,7 @@ const static struct gic_hw_operations gicv2_ops = {
>      .make_hwdom_madt     = gicv2_make_hwdom_madt,
>      .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>      .iomem_deny_access   = gicv2_iomem_deny_access,
> +    .do_LPI              = gicv2_do_LPI,
>  };
>  
>  /* Set up the GIC */
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 0785701..d8baebc 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -136,6 +136,62 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
>          return per_cpu(lpi_redist, cpu).redist_id << 16;
>  }
>  
> +/*
> + * Handle incoming LPIs, which are a bit special, because they are potentially
> + * numerous and also only get injected into guests. Treat them specially here,
> + * by just looking up their target vCPU and virtual LPI number and hand it
> + * over to the injection function.
> + * Please note that LPIs are edge-triggered only, also have no active state,
> + * so spurious interrupts on the host side are no issue (we can just ignore
> + * them).
> + * Also a guest cannot expect that firing interrupts that haven't been
> + * fully configured yet will reach the CPU, so we don't need to care about
> + * this special case.
> + */
> +void gicv3_do_LPI(unsigned int lpi)
> +{
> +    struct domain *d;
> +    union host_lpi *hlpip, hlpi;
> +    struct vcpu *vcpu;
> +
> +    /* EOI the LPI already. */
> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> +
> +    /* Find out if a guest mapped something to this physical LPI. */
> +    hlpip = gic_get_host_lpi(lpi);
> +    if ( !hlpip )
> +        return;
> +
> +    hlpi.data = read_u64_atomic(&hlpip->data);
> +
> +    /*
> +     * Unmapped events are marked with an invalid LPI ID. We can safely
> +     * ignore them, as they have no further state and no-one can expect
> +     * to see them if they have not been mapped.
> +     */
> +    if ( hlpi.virt_lpi == INVALID_LPI )
> +        return;
> +
> +    d = rcu_lock_domain_by_id(hlpi.dom_id);
> +    if ( !d )
> +        return;
> +
> +    /* Make sure we don't step beyond the vcpu array. */
> +    if ( hlpi.vcpu_id >= d->max_vcpus )
> +    {
> +        rcu_unlock_domain(d);
> +        return;
> +    }
> +
> +    vcpu = d->vcpu[hlpi.vcpu_id];
> +
> +    /* Check if the VCPU is ready to receive LPIs. */
> +    if ( vcpu->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +        vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
> +
> +    rcu_unlock_domain(d);
> +}
> +
>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>  {
>      uint64_t val;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index a559e5e..63dbc21 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1670,6 +1670,7 @@ static const struct gic_hw_operations gicv3_ops = {
>      .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>      .make_hwdom_madt     = gicv3_make_hwdom_madt,
>      .iomem_deny_access   = gicv3_iomem_deny_access,
> +    .do_LPI              = gicv3_do_LPI,
>  };
>  
>  static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 9522c6c..a56be34 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -697,7 +697,13 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
>              do_IRQ(regs, irq, is_fiq);
>              local_irq_disable();
>          }
> -        else if (unlikely(irq < 16))
> +        else if ( is_lpi(irq) )
> +        {
> +            local_irq_enable();
> +            gic_hw_ops->do_LPI(irq);
> +            local_irq_disable();
> +        }
> +        else if ( unlikely(irq < 16) )
>          {
>              do_sgi(regs, irq);
>          }
> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> index 3cad68f..5f4c5ad 100644
> --- a/xen/arch/arm/vgic-v2.c
> +++ b/xen/arch/arm/vgic-v2.c
> @@ -710,11 +710,18 @@ static struct pending_irq *vgic_v2_lpi_to_pending(struct domain *d,
>      BUG();
>  }
>  
> +static int vgic_v2_lpi_get_priority(struct domain *d, unsigned int vlpi)
> +{
> +    /* Dummy function, no LPIs on a VGICv2. */
> +    BUG();
> +}
> +
>  static const struct vgic_ops vgic_v2_ops = {
>      .vcpu_init   = vgic_v2_vcpu_init,
>      .domain_init = vgic_v2_domain_init,
>      .domain_free = vgic_v2_domain_free,
>      .lpi_to_pending = vgic_v2_lpi_to_pending,
> +    .lpi_get_priority = vgic_v2_lpi_get_priority,
>      .max_vcpus = 8,
>  };
>  
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 5128f13..2a14305 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1548,12 +1548,24 @@ struct pending_irq *vgic_v3_lpi_to_pending(struct domain *d, unsigned int lpi)
>      return pirq;
>  }
>  
> +/* Retrieve the priority of an LPI from its struct pending_irq. */
> +int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
> +
> +    if ( !p )
> +        return GIC_PRI_IRQ;
> +
> +    return p->lpi_priority;
> +}
> +
>  static const struct vgic_ops v3_ops = {
>      .vcpu_init   = vgic_v3_vcpu_init,
>      .domain_init = vgic_v3_domain_init,
>      .domain_free = vgic_v3_domain_free,
>      .emulate_reg  = vgic_v3_emulate_reg,
>      .lpi_to_pending = vgic_v3_lpi_to_pending,
> +    .lpi_get_priority = vgic_v3_lpi_get_priority,
>      /*
>       * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
>       * that can be supported is up to 4096(==256*16) in theory.
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index d704d7c..cd9a2a5 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -226,10 +226,15 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>  
>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>  {
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> +    struct vgic_irq_rank *rank;
>      unsigned long flags;
>      int priority;
>  
> +    /* LPIs don't have a rank, also store their priority separately. */
> +    if ( is_lpi(virq) )
> +        return v->domain->arch.vgic.handler->lpi_get_priority(v->domain, virq);
> +
> +    rank = vgic_rank_irq(v, virq);
>      vgic_lock_rank(v, rank, flags);
>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>      vgic_unlock_rank(v, rank, flags);
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 836a103..42963c0 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -366,6 +366,8 @@ struct gic_hw_operations {
>      int (*map_hwdom_extra_mappings)(struct domain *d);
>      /* Deny access to GIC regions */
>      int (*iomem_deny_access)(const struct domain *d);
> +    /* Handle LPIs, which require special handling */
> +    void (*do_LPI)(unsigned int lpi);
>  };
>  
>  void register_gic_ops(const struct gic_hw_operations *ops);
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 3b7c724..eb71fbd 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -139,6 +139,8 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  
>  bool gicv3_its_host_has_its(void);
>  
> +void gicv3_do_LPI(unsigned int lpi);
> +
>  int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>  
>  /* Initialize the host structures for LPIs and the host ITSes. */
> @@ -182,6 +184,12 @@ static inline bool gicv3_its_host_has_its(void)
>      return false;
>  }
>  
> +static inline void gicv3_do_LPI(unsigned int lpi)
> +{
> +    /* We don't enable LPIs without an ITS. */
> +    BUG();
> +}
> +
>  static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
>  {
>      return -ENODEV;
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 7c86f5b..08d6294 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -66,12 +66,14 @@ struct pending_irq
>  #define GIC_IRQ_GUEST_VISIBLE  2
>  #define GIC_IRQ_GUEST_ENABLED  3
>  #define GIC_IRQ_GUEST_MIGRATING   4
> +#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an LPI. */

I don't think we need GIC_IRQ_GUEST_LPI_PENDING; you can reuse
GIC_IRQ_GUEST_QUEUED and list_empty(&n->inflight): if you call
vgic_vcpu_inject_irq with an irq that is not enabled, it will get
GIC_IRQ_GUEST_QUEUED set and be added to inflight, but not injected.
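
In code terms, one possible reading of that (an untested sketch, the helper
name is made up):

    /*
     * "LPI made pending but not yet injected": vgic_vcpu_inject_irq() has
     * set QUEUED and linked the pending_irq into the inflight list, while
     * QUEUED gets cleared again once the irq makes it into an LR.
     */
    static bool vlpi_is_pending(const struct pending_irq *p)
    {
        return test_bit(GIC_IRQ_GUEST_QUEUED, &p->status) &&
               !list_empty(&p->inflight);
    }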


>      unsigned long status;
>      struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
>      unsigned int irq;
>  #define GIC_INVALID_LR         (uint8_t)~0
>      uint8_t lr;
>      uint8_t priority;
> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */

The commit message says: "we enhance struct pending_irq to cache the
pending bit and the priority information for LPIs, as we can't afford to
walk the tables in guest memory every time we handle an incoming LPI." I
thought it would be direct access, having the vlpi number in our hands?
Why is it a problem?

If there has been a conversation about this that I am missing, please
provide a link, I'll go back and read it.


>      /* inflight is used to append instances of pending_irq to
>       * vgic.inflight_irqs */
>      struct list_head inflight;
> @@ -136,6 +138,7 @@ struct vgic_ops {
>      bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
>      /* lookup the struct pending_irq for a given LPI interrupt */
>      struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int vlpi);
> +    int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
>      /* Maximum number of vCPU supported */
>      const unsigned int max_vcpus;
>  };
> -- 
> 2.8.2
>
Andre Przywara April 6, 2017, 5:42 p.m. UTC | #2
Hi Stefano,

thanks for spending your brain cells on this nasty code.

...

On 06/04/17 00:45, Stefano Stabellini wrote:
> On Thu, 6 Apr 2017, Andre Przywara wrote:
>> Upon receiving an LPI on the host, we need to find the right VCPU and
>> virtual IRQ number to get this IRQ injected.
>> Iterate our two-level LPI table to find this information quickly when
>> the host takes an LPI. Call the existing injection function to let the
>> GIC emulation deal with this interrupt.
>> Also we enhance struct pending_irq to cache the pending bit and the
>> priority information for LPIs, as we can't afford to walk the tables in
>> guest memory every time we handle an incoming LPI.
>> This introduces a do_LPI() as a hardware gic_ops and a function to
>> retrieve the (cached) priority value of an LPI and a vgic_ops.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-v2.c            |  7 +++++
>>  xen/arch/arm/gic-v3-lpi.c        | 56 ++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c            |  1 +
>>  xen/arch/arm/gic.c               |  8 +++++-
>>  xen/arch/arm/vgic-v2.c           |  7 +++++
>>  xen/arch/arm/vgic-v3.c           | 12 +++++++++
>>  xen/arch/arm/vgic.c              |  7 ++++-
>>  xen/include/asm-arm/gic.h        |  2 ++
>>  xen/include/asm-arm/gic_v3_its.h |  8 ++++++
>>  xen/include/asm-arm/vgic.h       |  3 +++
>>  10 files changed, 109 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
>> index 270a136..f4d7949 100644
>> --- a/xen/arch/arm/gic-v2.c
>> +++ b/xen/arch/arm/gic-v2.c
>> @@ -1217,6 +1217,12 @@ static int __init gicv2_init(void)
>>      return 0;
>>  }
>>  
>> +void gicv2_do_LPI(unsigned int lpi)
>> +{
>> +    /* No LPIs in a GICv2 */
>> +    BUG();
>> +}
>> +
>>  const static struct gic_hw_operations gicv2_ops = {
>>      .info                = &gicv2_info,
>>      .init                = gicv2_init,
>> @@ -1244,6 +1250,7 @@ const static struct gic_hw_operations gicv2_ops = {
>>      .make_hwdom_madt     = gicv2_make_hwdom_madt,
>>      .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
>>      .iomem_deny_access   = gicv2_iomem_deny_access,
>> +    .do_LPI              = gicv2_do_LPI,
>>  };
>>  
>>  /* Set up the GIC */
>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
>> index 0785701..d8baebc 100644
>> --- a/xen/arch/arm/gic-v3-lpi.c
>> +++ b/xen/arch/arm/gic-v3-lpi.c
>> @@ -136,6 +136,62 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
>>          return per_cpu(lpi_redist, cpu).redist_id << 16;
>>  }
>>  
>> +/*
>> + * Handle incoming LPIs, which are a bit special, because they are potentially
>> + * numerous and also only get injected into guests. Treat them specially here,
>> + * by just looking up their target vCPU and virtual LPI number and hand it
>> + * over to the injection function.
>> + * Please note that LPIs are edge-triggered only, also have no active state,
>> + * so spurious interrupts on the host side are no issue (we can just ignore
>> + * them).
>> + * Also a guest cannot expect that firing interrupts that haven't been
>> + * fully configured yet will reach the CPU, so we don't need to care about
>> + * this special case.
>> + */
>> +void gicv3_do_LPI(unsigned int lpi)
>> +{
>> +    struct domain *d;
>> +    union host_lpi *hlpip, hlpi;
>> +    struct vcpu *vcpu;
>> +
>> +    /* EOI the LPI already. */
>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>> +
>> +    /* Find out if a guest mapped something to this physical LPI. */
>> +    hlpip = gic_get_host_lpi(lpi);
>> +    if ( !hlpip )
>> +        return;
>> +
>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>> +
>> +    /*
>> +     * Unmapped events are marked with an invalid LPI ID. We can safely
>> +     * ignore them, as they have no further state and no-one can expect
>> +     * to see them if they have not been mapped.
>> +     */
>> +    if ( hlpi.virt_lpi == INVALID_LPI )
>> +        return;
>> +
>> +    d = rcu_lock_domain_by_id(hlpi.dom_id);
>> +    if ( !d )
>> +        return;
>> +
>> +    /* Make sure we don't step beyond the vcpu array. */
>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>> +    {
>> +        rcu_unlock_domain(d);
>> +        return;
>> +    }
>> +
>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>> +
>> +    /* Check if the VCPU is ready to receive LPIs. */
>> +    if ( vcpu->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
>> +        vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>> +
>> +    rcu_unlock_domain(d);
>> +}
>> +
>>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>>  {
>>      uint64_t val;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index a559e5e..63dbc21 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -1670,6 +1670,7 @@ static const struct gic_hw_operations gicv3_ops = {
>>      .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
>>      .make_hwdom_madt     = gicv3_make_hwdom_madt,
>>      .iomem_deny_access   = gicv3_iomem_deny_access,
>> +    .do_LPI              = gicv3_do_LPI,
>>  };
>>  
>>  static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index 9522c6c..a56be34 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -697,7 +697,13 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
>>              do_IRQ(regs, irq, is_fiq);
>>              local_irq_disable();
>>          }
>> -        else if (unlikely(irq < 16))
>> +        else if ( is_lpi(irq) )
>> +        {
>> +            local_irq_enable();
>> +            gic_hw_ops->do_LPI(irq);
>> +            local_irq_disable();
>> +        }
>> +        else if ( unlikely(irq < 16) )
>>          {
>>              do_sgi(regs, irq);
>>          }
>> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>> index 3cad68f..5f4c5ad 100644
>> --- a/xen/arch/arm/vgic-v2.c
>> +++ b/xen/arch/arm/vgic-v2.c
>> @@ -710,11 +710,18 @@ static struct pending_irq *vgic_v2_lpi_to_pending(struct domain *d,
>>      BUG();
>>  }
>>  
>> +static int vgic_v2_lpi_get_priority(struct domain *d, unsigned int vlpi)
>> +{
>> +    /* Dummy function, no LPIs on a VGICv2. */
>> +    BUG();
>> +}
>> +
>>  static const struct vgic_ops vgic_v2_ops = {
>>      .vcpu_init   = vgic_v2_vcpu_init,
>>      .domain_init = vgic_v2_domain_init,
>>      .domain_free = vgic_v2_domain_free,
>>      .lpi_to_pending = vgic_v2_lpi_to_pending,
>> +    .lpi_get_priority = vgic_v2_lpi_get_priority,
>>      .max_vcpus = 8,
>>  };
>>  
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index 5128f13..2a14305 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -1548,12 +1548,24 @@ struct pending_irq *vgic_v3_lpi_to_pending(struct domain *d, unsigned int lpi)
>>      return pirq;
>>  }
>>  
>> +/* Retrieve the priority of an LPI from its struct pending_irq. */
>> +int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)
>> +{
>> +    struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
>> +
>> +    if ( !p )
>> +        return GIC_PRI_IRQ;
>> +
>> +    return p->lpi_priority;
>> +}
>> +
>>  static const struct vgic_ops v3_ops = {
>>      .vcpu_init   = vgic_v3_vcpu_init,
>>      .domain_init = vgic_v3_domain_init,
>>      .domain_free = vgic_v3_domain_free,
>>      .emulate_reg  = vgic_v3_emulate_reg,
>>      .lpi_to_pending = vgic_v3_lpi_to_pending,
>> +    .lpi_get_priority = vgic_v3_lpi_get_priority,
>>      /*
>>       * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
>>       * that can be supported is up to 4096(==256*16) in theory.
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index d704d7c..cd9a2a5 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -226,10 +226,15 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>  
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>  
>> +    /* LPIs don't have a rank, also store their priority separately. */
>> +    if ( is_lpi(virq) )
>> +        return v->domain->arch.vgic.handler->lpi_get_priority(v->domain, virq);
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
>> index 836a103..42963c0 100644
>> --- a/xen/include/asm-arm/gic.h
>> +++ b/xen/include/asm-arm/gic.h
>> @@ -366,6 +366,8 @@ struct gic_hw_operations {
>>      int (*map_hwdom_extra_mappings)(struct domain *d);
>>      /* Deny access to GIC regions */
>>      int (*iomem_deny_access)(const struct domain *d);
>> +    /* Handle LPIs, which require special handling */
>> +    void (*do_LPI)(unsigned int lpi);
>>  };
>>  
>>  void register_gic_ops(const struct gic_hw_operations *ops);
>> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
>> index 3b7c724..eb71fbd 100644
>> --- a/xen/include/asm-arm/gic_v3_its.h
>> +++ b/xen/include/asm-arm/gic_v3_its.h
>> @@ -139,6 +139,8 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>>  
>>  bool gicv3_its_host_has_its(void);
>>  
>> +void gicv3_do_LPI(unsigned int lpi);
>> +
>>  int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>>  
>>  /* Initialize the host structures for LPIs and the host ITSes. */
>> @@ -182,6 +184,12 @@ static inline bool gicv3_its_host_has_its(void)
>>      return false;
>>  }
>>  
>> +static inline void gicv3_do_LPI(unsigned int lpi)
>> +{
>> +    /* We don't enable LPIs without an ITS. */
>> +    BUG();
>> +}
>> +
>>  static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
>>  {
>>      return -ENODEV;
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 7c86f5b..08d6294 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -66,12 +66,14 @@ struct pending_irq
>>  #define GIC_IRQ_GUEST_VISIBLE  2
>>  #define GIC_IRQ_GUEST_ENABLED  3
>>  #define GIC_IRQ_GUEST_MIGRATING   4
>> +#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an LPI. */
> 
> I don't think we need GIC_IRQ_GUEST_LPI_PENDING, you can reuse
> GIC_IRQ_GUEST_QUEUED and list_empty(&n->inflight): if you call
> vgic_vcpu_inject_irq passing an irq as argument that is not enabled, it
> will get GIC_IRQ_GUEST_QUEUED and added to inflight, but not injected.

Mmh, that's an interesting suggestion.
The neat thing about piggy-backing on those status bits is that we can
use the atomic set_bit/clear_bit to set and clear the pending state,
independently of any code accessing the structure and without taking
any lock, especially since the existing VGIC code does not care about
this bit.
However, looking at all users of GUEST_LPI_PENDING, I see that we either
already hold the VGIC lock or are able to take it, so we might be
able to use the combination you suggested.
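
For reference, the kind of lock-free manipulation I mean boils down to the
usual atomic bit operations on pending_irq->status; a rough sketch (the
helper names are made up):

    /* Record the cached pending state of a virtual LPI, without any lock. */
    static void vlpi_set_pending(struct pending_irq *p)
    {
        set_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status);
    }

    /* Consume it later, again without holding the VGIC lock. */
    static void vlpi_sync_pending(struct vcpu *v, struct pending_irq *p)
    {
        if ( test_and_clear_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status) )
            vgic_vcpu_inject_irq(v, p->irq);
    }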

I tried to make some sense of what "inflight" *really* means; can you
shed some light on this?
My understanding is that it's about IRQs that should be injected, but
are not ready yet (because they are disabled). Are there other cases where
IRQs end up on the "inflight" list? Is it also the spillover in case all LRs
are used (in that case they would be GIC_IRQ_GUEST_ENABLED)?

>>      unsigned long status;
>>      struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
>>      unsigned int irq;
>>  #define GIC_INVALID_LR         (uint8_t)~0
>>      uint8_t lr;
>>      uint8_t priority;
>> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
> 
> The commit message says: "we enhance struct pending_irq to cache the
> pending bit and the priority information for LPIs, as we can't afford to
> walk the tables in guest memory every time we handle an incoming LPI." I
> thought it would be direct access, having the vlpi number in our hands?
> Why is it a problem?
> 
> If there has been a conversation about this that I am missing, please
> provide a link, I'll go back and read it.

Well, the property table lives in guest memory like the other ITS tables,
and we now access it in a new way (vgic_access_guest_memory()), which is
quite costly: we need to do the p2m lookup, map the page, access the
data, unmap the page and put the page back. Each step on its own is not
really too demanding (on arm64), but doing all of this in the interrupt
handling path just to learn the priority value sounds a bit over the top.
For the *ITS* emulation (command handling) we can cope with this
overhead (because it is only needed during virtual command
handling), but I wanted to avoid it in the generic GIC code, which is
also a hot path, as I understand it.
Since the GICv3 architecture explicitly allows caching this information,
I wanted to use this opportunity.
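
Just to illustrate what the uncached variant would have to do on every
single incoming LPI (illustration only, not proposed code: the propbase
field name is made up and guest_memory_read() is a stand-in, not the real
vgic_access_guest_memory() interface):

    /* Stand-in for the p2m-lookup/map/copy/unmap/put sequence above. */
    int guest_memory_read(struct domain *d, paddr_t gpa, void *buf, size_t len);

    static uint8_t vlpi_read_priority_from_guest(struct domain *d, uint32_t vlpi)
    {
        /* One property byte per LPI, LPI IDs start at 8192. */
        paddr_t gpa = d->arch.vgic.propbase + (vlpi - 8192);
        uint8_t prop = GIC_PRI_IRQ;

        guest_memory_read(d, gpa, &prop, sizeof(prop));

        /* Bits [7:2] of the property byte hold the priority. */
        return prop & 0xfc;
    }

Compare that with the cached version in this patch, which is a simple
struct read.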

Does that make sense?

Cheers,
Andre.

> 
>>      /* inflight is used to append instances of pending_irq to
>>       * vgic.inflight_irqs */
>>      struct list_head inflight;
>> @@ -136,6 +138,7 @@ struct vgic_ops {
>>      bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
>>      /* lookup the struct pending_irq for a given LPI interrupt */
>>      struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int vlpi);
>> +    int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
>>      /* Maximum number of vCPU supported */
>>      const unsigned int max_vcpus;
>>  };
>> -- 
>> 2.8.2
>>
Julien Grall April 6, 2017, 6:10 p.m. UTC | #3
Hi Andre,

On 06/04/17 00:19, Andre Przywara wrote:
> diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
> index 270a136..f4d7949 100644
> --- a/xen/arch/arm/gic-v2.c
> +++ b/xen/arch/arm/gic-v2.c
> @@ -1217,6 +1217,12 @@ static int __init gicv2_init(void)
>      return 0;
>  }
>
> +void gicv2_do_LPI(unsigned int lpi)

This should be static.

> +{
> +    /* No LPIs in a GICv2 */
> +    BUG();
> +}
> +

[...]

> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 0785701..d8baebc 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -136,6 +136,62 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
>          return per_cpu(lpi_redist, cpu).redist_id << 16;
>  }
>
> +/*
> + * Handle incoming LPIs, which are a bit special, because they are potentially
> + * numerous and also only get injected into guests. Treat them specially here,
> + * by just looking up their target vCPU and virtual LPI number and hand it
> + * over to the injection function.
> + * Please note that LPIs are edge-triggered only, also have no active state,
> + * so spurious interrupts on the host side are no issue (we can just ignore
> + * them).
> + * Also a guest cannot expect that firing interrupts that haven't been
> + * fully configured yet will reach the CPU, so we don't need to care about
> + * this special case.
> + */
> +void gicv3_do_LPI(unsigned int lpi)

Ditto.

> +{
> +    struct domain *d;
> +    union host_lpi *hlpip, hlpi;
> +    struct vcpu *vcpu;
> +

As mentioned on the previous version, you will need irq_enter and
irq_exit in the return path, so the common code knows you are in an
interrupt handler (see in_irq()).
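
Something along these lines (only a sketch, taking the code from this patch
and restructuring it around a single exit path):

    void gicv3_do_LPI(unsigned int lpi)
    {
        struct domain *d;
        union host_lpi *hlpip, hlpi;
        struct vcpu *vcpu;

        irq_enter();

        /* EOI the LPI already. */
        WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);

        /* Find out if a guest mapped something to this physical LPI. */
        hlpip = gic_get_host_lpi(lpi);
        if ( !hlpip )
            goto out;

        hlpi.data = read_u64_atomic(&hlpip->data);

        /* Unmapped events are marked with an invalid LPI ID. */
        if ( hlpi.virt_lpi == INVALID_LPI )
            goto out;

        d = rcu_lock_domain_by_id(hlpi.dom_id);
        if ( !d )
            goto out;

        /* Make sure we don't step beyond the vcpu array. */
        if ( hlpi.vcpu_id < d->max_vcpus )
        {
            vcpu = d->vcpu[hlpi.vcpu_id];

            /* Check if the VCPU is ready to receive LPIs. */
            if ( vcpu->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
                vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
        }

        rcu_unlock_domain(d);

    out:
        irq_exit();
    }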

[...]

> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 5128f13..2a14305 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1548,12 +1548,24 @@ struct pending_irq *vgic_v3_lpi_to_pending(struct domain *d, unsigned int lpi)
>      return pirq;
>  }
>
> +/* Retrieve the priority of an LPI from its struct pending_irq. */
> +int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)

This should be static.

> +{
> +    struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
> +
> +    if ( !p )
> +        return GIC_PRI_IRQ;
> +
> +    return p->lpi_priority;
> +}
> +
>  static const struct vgic_ops v3_ops = {
>      .vcpu_init   = vgic_v3_vcpu_init,
>      .domain_init = vgic_v3_domain_init,
>      .domain_free = vgic_v3_domain_free,
>      .emulate_reg  = vgic_v3_emulate_reg,
>      .lpi_to_pending = vgic_v3_lpi_to_pending,
> +    .lpi_get_priority = vgic_v3_lpi_get_priority,
>      /*
>       * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
>       * that can be supported is up to 4096(==256*16) in theory.

[...]

> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 7c86f5b..08d6294 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -66,12 +66,14 @@ struct pending_irq
>  #define GIC_IRQ_GUEST_VISIBLE  2
>  #define GIC_IRQ_GUEST_ENABLED  3
>  #define GIC_IRQ_GUEST_MIGRATING   4
> +#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an LPI. */

Please use the big comment above to describe GUEST_LPI_PENDING.
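
E.g. an entry along these lines in that comment block (the wording is only a
suggestion):

    /*
     * ...
     * GIC_IRQ_GUEST_LPI_PENDING: the cached pending state of an LPI, i.e.
     *     the LPI has been made pending but has not been injected into the
     *     guest yet.
     * ...
     */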

>      unsigned long status;
>      struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
>      unsigned int irq;
>  #define GIC_INVALID_LR         (uint8_t)~0
>      uint8_t lr;
>      uint8_t priority;
> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
>      /* inflight is used to append instances of pending_irq to
>       * vgic.inflight_irqs */
>      struct list_head inflight;
> @@ -136,6 +138,7 @@ struct vgic_ops {
>      bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
>      /* lookup the struct pending_irq for a given LPI interrupt */
>      struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int vlpi);
> +    int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
>      /* Maximum number of vCPU supported */
>      const unsigned int max_vcpus;
>  };
>

Cheers,
Stefano Stabellini April 6, 2017, 6:47 p.m. UTC | #4
On Thu, 6 Apr 2017, Andre Przywara wrote:
> Hi Stefano,
> 
> thanks for spending your brain cells on this nasty code.
> 
> ...
> 
> On 06/04/17 00:45, Stefano Stabellini wrote:
> > On Thu, 6 Apr 2017, Andre Przywara wrote:
> >> Upon receiving an LPI on the host, we need to find the right VCPU and
> >> virtual IRQ number to get this IRQ injected.
> >> Iterate our two-level LPI table to find this information quickly when
> >> the host takes an LPI. Call the existing injection function to let the
> >> GIC emulation deal with this interrupt.
> >> Also we enhance struct pending_irq to cache the pending bit and the
> >> priority information for LPIs, as we can't afford to walk the tables in
> >> guest memory every time we handle an incoming LPI.
> >> This introduces a do_LPI() as a hardware gic_ops and a function to
> >> retrieve the (cached) priority value of an LPI and a vgic_ops.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >>
> >> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> >> index 7c86f5b..08d6294 100644
> >> --- a/xen/include/asm-arm/vgic.h
> >> +++ b/xen/include/asm-arm/vgic.h
> >> @@ -66,12 +66,14 @@ struct pending_irq
> >>  #define GIC_IRQ_GUEST_VISIBLE  2
> >>  #define GIC_IRQ_GUEST_ENABLED  3
> >>  #define GIC_IRQ_GUEST_MIGRATING   4
> >> +#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an LPI. */
> > 
> > I don't think we need GIC_IRQ_GUEST_LPI_PENDING, you can reuse
> > GIC_IRQ_GUEST_QUEUED and list_empty(&n->inflight): if you call
> > vgic_vcpu_inject_irq passing an irq as argument that is not enabled, it
> > will get GIC_IRQ_GUEST_QUEUED and added to inflight, but not injected.
> 
> Mmh, that's an interesting suggestion.
> The neat thing about piggy-backing on those status bits is that we can
> use the atomic set_bit/clear_bit to set and clear the pending state,
> independently from any code accessing the structure, even without taking
> any lock. Especially since the existing VGIC code does not care about
> this bit.
> However looking at all users of GUEST_LPI_PENDING, I see that we either
> already hold the VGIC lock and we are able to take it, so we might be
> able to use the combination you suggested.
> 
> I tried to make some sense of what "inflight" *really* means, can you
> shed some light on this?

See the comment in xen/include/asm-arm/vgic.h.

vgic_vcpu_inject_irq(irq) is called:
- GIC_IRQ_GUEST_QUEUED is set
- pending_irq is added to the inflight list

At this stage the irq is tracked as being inflight, but not really
injected into the guest yet.

Once/If the irq is enabled, we try to inject it into the guest.
gic_raise_guest_irq(irq) is called:
- if there is a free LR
  - clear GIC_IRQ_GUEST_QUEUED
  - set GIC_IRQ_GUEST_VISIBLE
- otherwise
  - add pending_irq to lr_pending
    - pending_irq is still on the inflight list too
    - pending_irq will be removed from lr_pending, and the irq added to an LR when
      available

When the guest finishes with the irq and EOIes it:
gic_update_one_lr is called:
- clear LR
- clear GIC_IRQ_GUEST_VISIBLE
- pending_irq is removed from inflight


Thus, you can use list_empty(&p->inflight) and QUEUED to check whether the
LPI has been set to PENDING but hasn't been injected yet, because the
corresponding pending_irq remains on the inflight list until the LPI has
been EOIed, while QUEUED is cleared as soon as the irq is added to an LR.

It is possible to set an interrupt as QUEUED, even if pending_irq is
already inflight and the irq is VISIBLE. That just means that the
physical irq has been reasserted after the virq becomes ACTIVE and
before the EOI.


> My understanding is that it's about IRQs that should be injected, but
> are not ready (because they are disabled). Are there other cases IRQs
> are in the "inflight" list? Is that also the spillover in case all LRs
> are used (in this case they would GIC_IRQ_GUEST_ENABLED)?

No. If all LRs are used, pending_irq is added to lr_pending. However,
pending_irq *also* remains in the inflight list until EOI.

(ENABLED && QUEUED && inflight) is the condition for trying to inject the
irq into the guest; list_empty(&v->arch.vgic.lr_pending) is the
condition for actually being able to add the irq to an LR.
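
Putting the states above into code terms (just a sketch; the predicate names
are mine, not existing helpers):

    /* Asserted by vgic_vcpu_inject_irq(), not yet in an LR. */
    static bool virq_is_queued(const struct pending_irq *p)
    {
        return test_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
    }

    /* Injected into an LR, the guest can see it. */
    static bool virq_is_visible(const struct pending_irq *p)
    {
        return test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status);
    }

    /* Nothing outstanding: either never raised or already EOIed. */
    static bool virq_is_idle(const struct pending_irq *p)
    {
        return list_empty(&p->inflight);
    }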


> >>      unsigned long status;
> >>      struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
> >>      unsigned int irq;
> >>  #define GIC_INVALID_LR         (uint8_t)~0
> >>      uint8_t lr;
> >>      uint8_t priority;
> >> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
> > 
> > The commit message says: "we enhance struct pending_irq to cache the
> > pending bit and the priority information for LPIs, as we can't afford to
> > walk the tables in guest memory every time we handle an incoming LPI." I
> > thought it would be direct access, having the vlpi number in our hands?
> > Why is it a problem?
> > 
> > If there has been a conversation about this that I am missing, please
> > provide a link, I'll go back and read it.
> 
> Well, the property table is in guest memory as the other ITS tables and
> we now access this in a new way (vgic_access_guest_memory()), which is
> quite costly: we need to do the p2m lookup, map the page, access the
> data, unmap the page and put the page back. 

map, unmap and put are (almost) nops. The only real cost is the
p2m_lookup, which could easily be avoided by storing the struct page_info*
or mfn corresponding to the guest-provided data structures. We should be
able to do this in a very small number of steps, right?
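
Something like this, roughly (a very rough, untested sketch, all names
invented): resolve each frame of the guest's property table once, when the
guest writes PROPBASER, and keep the MFNs around so the hot path only needs
map/read/unmap.

    struct propbase_cache {
        unsigned int nr_frames;
        mfn_t *frames;              /* one entry per 4K frame of the table */
    };

    static uint8_t cached_read_priority(const struct propbase_cache *c,
                                        uint32_t vlpi)
    {
        unsigned int offset = vlpi - 8192;   /* one property byte per LPI */
        uint8_t *page, prop;

        /* (range checks omitted in this sketch) */
        page = map_domain_page(c->frames[offset >> PAGE_SHIFT]);
        prop = page[offset & ~PAGE_MASK];
        unmap_domain_page(page);

        return prop & 0xfc;                  /* priority is in bits [7:2] */
    }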


> Everything on itself is not
> really too demanding (on arm64), but doing this in the interrupt
> handling path to learn the priority value sounds a bit over the top.
> For the *ITS* emulation (command handling) we can cope with this
> overhead (because this is only needing during the virtual command
> handling), but I wanted to avoid this in the generic GIC code, which is
> also a hot path, as I understand.
> Since the GICv3 architecture explicitly allows caching this information,
> I wanted to use this opportunity.
> 
> Does that make sense?

Yes, I see your reasoning and you are right that it is very important to
keep the hot path very fast. However, I think we should be able to do
that without duplicating information and adding another field to
pending_irq.

If you think that what I suggested is feasible but it's too complex to
do for now, then I would still drop the new lpi_priority field from
pending_irq, implement the slow version based on
vgic_access_guest_memory for now, and add another TODO comment in the
code.
Julien Grall April 6, 2017, 7:13 p.m. UTC | #5
Hi Stefano,

On 04/06/2017 07:47 PM, Stefano Stabellini wrote:
> On Thu, 6 Apr 2017, Andre Przywara wrote:
>>>>      unsigned long status;
>>>>      struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
>>>>      unsigned int irq;
>>>>  #define GIC_INVALID_LR         (uint8_t)~0
>>>>      uint8_t lr;
>>>>      uint8_t priority;
>>>> +    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
>>>
>>> The commit message says: "we enhance struct pending_irq to cache the
>>> pending bit and the priority information for LPIs, as we can't afford to
>>> walk the tables in guest memory every time we handle an incoming LPI." I
>>> thought it would be direct access, having the vlpi number in our hands?
>>> Why is it a problem?
>>>
>>> If there has been a conversation about this that I am missing, please
>>> provide a link, I'll go back and read it.
>>
>> Well, the property table is in guest memory as the other ITS tables and
>> we now access this in a new way (vgic_access_guest_memory()), which is
>> quite costly: we need to do the p2m lookup, map the page, access the
>> data, unmap the page and put the page back.
>
> map, unmap and put are (almost) nop. The only operation is the
> p2m_lookup that could be easily avoided by storing the struct page_info*
> or mfn corresponding to the guest-provided data structures. We should be
> able to do this in a very small number of steps, right?

The property table could be really big (it scales with the number of LPIs
supported by the guest). The memory is contiguous from the domain's point of
view but not necessarily on the host, so you would have to store a large
number of MFNs. IIRC the property table can be up to 4GB, so we would
need an 8MB array.
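
Spelling out those numbers: one property byte per LPI gives a table of up to
4GB; at 4KB per frame that is about one million frames to track, and at 8
bytes per stored MFN that is the 8MB mentioned above, per guest.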

Also, reading from the guest would require safety checks that are not
necessary here.

Furthermore, we have space in pending_irq because of padding.

So why bother doing that?

>
>
>> Everything on itself is not
>> really too demanding (on arm64), but doing this in the interrupt
>> handling path to learn the priority value sounds a bit over the top.
>> For the *ITS* emulation (command handling) we can cope with this
>> overhead (because this is only needing during the virtual command
>> handling), but I wanted to avoid this in the generic GIC code, which is
>> also a hot path, as I understand.
>> Since the GICv3 architecture explicitly allows caching this information,
>> I wanted to use this opportunity.
>>
>> Does that make sense?
>
> Yes, I see your reasoning and you are right that it is very important to
> keep the hot path very fast. However, I think we should be able to do
> that without duplicating information and adding another field to
> pending_irq.

Why? We should not trust information living in guest memory; that would be
an open invitation to attack Xen. So this means more safety checks.

Cheers,
Stefano Stabellini April 6, 2017, 7:47 p.m. UTC | #6
On Thu, 6 Apr 2017, Julien Grall wrote:
> On 04/06/2017 07:47 PM, Stefano Stabellini wrote:
> > On Thu, 6 Apr 2017, Andre Przywara wrote:
> > > > >      unsigned long status;
> > > > >      struct irq_desc *desc; /* only set it the irq corresponds to a
> > > > > physical irq */
> > > > >      unsigned int irq;
> > > > >  #define GIC_INVALID_LR         (uint8_t)~0
> > > > >      uint8_t lr;
> > > > >      uint8_t priority;
> > > > > +    uint8_t lpi_priority;       /* Caches the priority if this is an
> > > > > LPI. */
> > > > 
> > > > The commit message says: "we enhance struct pending_irq to cache the
> > > > pending bit and the priority information for LPIs, as we can't afford to
> > > > walk the tables in guest memory every time we handle an incoming LPI." I
> > > > thought it would be direct access, having the vlpi number in our hands?
> > > > Why is it a problem?
> > > > 
> > > > If there has been a conversation about this that I am missing, please
> > > > provide a link, I'll go back and read it.
> > > 
> > > Well, the property table is in guest memory as the other ITS tables and
> > > we now access this in a new way (vgic_access_guest_memory()), which is
> > > quite costly: we need to do the p2m lookup, map the page, access the
> > > data, unmap the page and put the page back.
> > 
> > map, unmap and put are (almost) nop. The only operation is the
> > p2m_lookup that could be easily avoided by storing the struct page_info*
> > or mfn corresponding to the guest-provided data structures. We should be
> > able to do this in a very small number of steps, right?
> 
> The property table could be really big (up to the number of LPIs supported by
> the guest). The memory is contiguous from a domain point of view but not
> necessarily on the host. So you would have to store a big number of MFN. IIRC
> the property table can be up to 4GB, so we would need an array of 8MB.

Yes, but in that scenario, with so many LPIs, the new lpi_priority field
will use overall 4GB. In comparison 8MB sounds awesome. But I didn't
notice that it was using the padding space.


> Also reading from the guest would require some safety which is not necessary
> here.
> 
> Furthermore, we have space in pending_irq because of padding.
> 
> So why bothering doing that?

I didn't notice that it was taking over some of the padding. That is
much better. I can live with it then.
Julien Grall April 6, 2017, 7:54 p.m. UTC | #7
Hi Stefano,

On 04/06/2017 08:47 PM, Stefano Stabellini wrote:
> On Thu, 6 Apr 2017, Julien Grall wrote:
>> On 04/06/2017 07:47 PM, Stefano Stabellini wrote:
>>> On Thu, 6 Apr 2017, Andre Przywara wrote:
>>>>>>      unsigned long status;
>>>>>>      struct irq_desc *desc; /* only set it the irq corresponds to a
>>>>>> physical irq */
>>>>>>      unsigned int irq;
>>>>>>  #define GIC_INVALID_LR         (uint8_t)~0
>>>>>>      uint8_t lr;
>>>>>>      uint8_t priority;
>>>>>> +    uint8_t lpi_priority;       /* Caches the priority if this is an
>>>>>> LPI. */
>>>>>
>>>>> The commit message says: "we enhance struct pending_irq to cache the
>>>>> pending bit and the priority information for LPIs, as we can't afford to
>>>>> walk the tables in guest memory every time we handle an incoming LPI." I
>>>>> thought it would be direct access, having the vlpi number in our hands?
>>>>> Why is it a problem?
>>>>>
>>>>> If there has been a conversation about this that I am missing, please
>>>>> provide a link, I'll go back and read it.
>>>>
>>>> Well, the property table is in guest memory as the other ITS tables and
>>>> we now access this in a new way (vgic_access_guest_memory()), which is
>>>> quite costly: we need to do the p2m lookup, map the page, access the
>>>> data, unmap the page and put the page back.
>>>
>>> map, unmap and put are (almost) nop. The only operation is the
>>> p2m_lookup that could be easily avoided by storing the struct page_info*
>>> or mfn corresponding to the guest-provided data structures. We should be
>>> able to do this in a very small number of steps, right?
>>
>> The property table could be really big (up to the number of LPIs supported by
>> the guest). The memory is contiguous from a domain point of view but not
>> necessarily on the host. So you would have to store a big number of MFN. IIRC
>> the property table can be up to 4GB, so we would need an array of 8MB.
>
> Yes, but in that scenario, with so many LPIs, the new lpi_priority field
> will use overall 4GB. In comparison 8MB sounds awesome. But I didn't
> notice that it was using the padding space.
>
>
>> Also reading from the guest would require some safety which is not necessary
>> here.
>>
>> Furthermore, we have space in pending_irq because of padding.
>>
>> So why bothering doing that?
>
> I didn't notice that it was taking over some of the padding. That is
> much better. I can live with it then.

Would a comment /* This field is only used for caching LPI priority. 
This could be removed and read from the guest memory if we need space. 
*/ be useful?

Cheers,
Stefano Stabellini April 6, 2017, 8:29 p.m. UTC | #8
On Thu, 6 Apr 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 04/06/2017 08:47 PM, Stefano Stabellini wrote:
> > On Thu, 6 Apr 2017, Julien Grall wrote:
> > > On 04/06/2017 07:47 PM, Stefano Stabellini wrote:
> > > > On Thu, 6 Apr 2017, Andre Przywara wrote:
> > > > > > >      unsigned long status;
> > > > > > >      struct irq_desc *desc; /* only set it the irq corresponds to
> > > > > > > a
> > > > > > > physical irq */
> > > > > > >      unsigned int irq;
> > > > > > >  #define GIC_INVALID_LR         (uint8_t)~0
> > > > > > >      uint8_t lr;
> > > > > > >      uint8_t priority;
> > > > > > > +    uint8_t lpi_priority;       /* Caches the priority if this is
> > > > > > > an
> > > > > > > LPI. */
> > > > > > 
> > > > > > The commit message says: "we enhance struct pending_irq to cache the
> > > > > > pending bit and the priority information for LPIs, as we can't
> > > > > > afford to
> > > > > > walk the tables in guest memory every time we handle an incoming
> > > > > > LPI." I
> > > > > > thought it would be direct access, having the vlpi number in our
> > > > > > hands?
> > > > > > Why is it a problem?
> > > > > > 
> > > > > > If there has been a conversation about this that I am missing,
> > > > > > please
> > > > > > provide a link, I'll go back and read it.
> > > > > 
> > > > > Well, the property table is in guest memory as the other ITS tables
> > > > > and
> > > > > we now access this in a new way (vgic_access_guest_memory()), which is
> > > > > quite costly: we need to do the p2m lookup, map the page, access the
> > > > > data, unmap the page and put the page back.
> > > > 
> > > > map, unmap and put are (almost) nop. The only operation is the
> > > > p2m_lookup that could be easily avoided by storing the struct page_info*
> > > > or mfn corresponding to the guest-provided data structures. We should be
> > > > able to do this in a very small number of steps, right?
> > > 
> > > The property table could be really big (up to the number of LPIs supported
> > > by
> > > the guest). The memory is contiguous from a domain point of view but not
> > > necessarily on the host. So you would have to store a big number of MFN.
> > > IIRC
> > > the property table can be up to 4GB, so we would need an array of 8MB.
> > 
> > Yes, but in that scenario, with so many LPIs, the new lpi_priority field
> > will use overall 4GB. In comparison 8MB sounds awesome. But I didn't
> > notice that it was using the padding space.
> > 
> > 
> > > Also reading from the guest would require some safety which is not
> > > necessary
> > > here.
> > > 
> > > Furthermore, we have space in pending_irq because of padding.
> > > 
> > > So why bothering doing that?
> > 
> > I didn't notice that it was taking over some of the padding. That is
> > much better. I can live with it then.
> 
> Would a comment /* This field is only used for caching LPI priority. This
> could be removed and read from the guest memory if we need space. */ be
> useful?

No need for a comment but the following sentence in the commit message
is a bit misleading:

"as we can't afford to walk the tables in guest memory every time we
handle an incoming LPI."

I would just rewrite it to say that it's faster than accessing guest
tables, and it doesn't require any more memory as we are exploiting some
of the bytes used for padding.

Patch

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 270a136..f4d7949 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1217,6 +1217,12 @@  static int __init gicv2_init(void)
     return 0;
 }
 
+void gicv2_do_LPI(unsigned int lpi)
+{
+    /* No LPIs in a GICv2 */
+    BUG();
+}
+
 const static struct gic_hw_operations gicv2_ops = {
     .info                = &gicv2_info,
     .init                = gicv2_init,
@@ -1244,6 +1250,7 @@  const static struct gic_hw_operations gicv2_ops = {
     .make_hwdom_madt     = gicv2_make_hwdom_madt,
     .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
     .iomem_deny_access   = gicv2_iomem_deny_access,
+    .do_LPI              = gicv2_do_LPI,
 };
 
 /* Set up the GIC */
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 0785701..d8baebc 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -136,6 +136,62 @@  uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
         return per_cpu(lpi_redist, cpu).redist_id << 16;
 }
 
+/*
+ * Handle incoming LPIs, which are a bit special, because they are potentially
+ * numerous and also only get injected into guests. Treat them specially here,
+ * by just looking up their target vCPU and virtual LPI number and hand it
+ * over to the injection function.
+ * Please note that LPIs are edge-triggered only, also have no active state,
+ * so spurious interrupts on the host side are no issue (we can just ignore
+ * them).
+ * Also a guest cannot expect that firing interrupts that haven't been
+ * fully configured yet will reach the CPU, so we don't need to care about
+ * this special case.
+ */
+void gicv3_do_LPI(unsigned int lpi)
+{
+    struct domain *d;
+    union host_lpi *hlpip, hlpi;
+    struct vcpu *vcpu;
+
+    /* EOI the LPI already. */
+    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
+
+    /* Find out if a guest mapped something to this physical LPI. */
+    hlpip = gic_get_host_lpi(lpi);
+    if ( !hlpip )
+        return;
+
+    hlpi.data = read_u64_atomic(&hlpip->data);
+
+    /*
+     * Unmapped events are marked with an invalid LPI ID. We can safely
+     * ignore them, as they have no further state and no-one can expect
+     * to see them if they have not been mapped.
+     */
+    if ( hlpi.virt_lpi == INVALID_LPI )
+        return;
+
+    d = rcu_lock_domain_by_id(hlpi.dom_id);
+    if ( !d )
+        return;
+
+    /* Make sure we don't step beyond the vcpu array. */
+    if ( hlpi.vcpu_id >= d->max_vcpus )
+    {
+        rcu_unlock_domain(d);
+        return;
+    }
+
+    vcpu = d->vcpu[hlpi.vcpu_id];
+
+    /* Check if the VCPU is ready to receive LPIs. */
+    if ( vcpu->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+        vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
+
+    rcu_unlock_domain(d);
+}
+
 static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
 {
     uint64_t val;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index a559e5e..63dbc21 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1670,6 +1670,7 @@  static const struct gic_hw_operations gicv3_ops = {
     .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
     .make_hwdom_madt     = gicv3_make_hwdom_madt,
     .iomem_deny_access   = gicv3_iomem_deny_access,
+    .do_LPI              = gicv3_do_LPI,
 };
 
 static int __init gicv3_dt_preinit(struct dt_device_node *node, const void *data)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 9522c6c..a56be34 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -697,7 +697,13 @@  void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
             do_IRQ(regs, irq, is_fiq);
             local_irq_disable();
         }
-        else if (unlikely(irq < 16))
+        else if ( is_lpi(irq) )
+        {
+            local_irq_enable();
+            gic_hw_ops->do_LPI(irq);
+            local_irq_disable();
+        }
+        else if ( unlikely(irq < 16) )
         {
             do_sgi(regs, irq);
         }
diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 3cad68f..5f4c5ad 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -710,11 +710,18 @@  static struct pending_irq *vgic_v2_lpi_to_pending(struct domain *d,
     BUG();
 }
 
+static int vgic_v2_lpi_get_priority(struct domain *d, unsigned int vlpi)
+{
+    /* Dummy function, no LPIs on a VGICv2. */
+    BUG();
+}
+
 static const struct vgic_ops vgic_v2_ops = {
     .vcpu_init   = vgic_v2_vcpu_init,
     .domain_init = vgic_v2_domain_init,
     .domain_free = vgic_v2_domain_free,
     .lpi_to_pending = vgic_v2_lpi_to_pending,
+    .lpi_get_priority = vgic_v2_lpi_get_priority,
     .max_vcpus = 8,
 };
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 5128f13..2a14305 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1548,12 +1548,24 @@  struct pending_irq *vgic_v3_lpi_to_pending(struct domain *d, unsigned int lpi)
     return pirq;
 }
 
+/* Retrieve the priority of an LPI from its struct pending_irq. */
+int vgic_v3_lpi_get_priority(struct domain *d, uint32_t vlpi)
+{
+    struct pending_irq *p = vgic_v3_lpi_to_pending(d, vlpi);
+
+    if ( !p )
+        return GIC_PRI_IRQ;
+
+    return p->lpi_priority;
+}
+
 static const struct vgic_ops v3_ops = {
     .vcpu_init   = vgic_v3_vcpu_init,
     .domain_init = vgic_v3_domain_init,
     .domain_free = vgic_v3_domain_free,
     .emulate_reg  = vgic_v3_emulate_reg,
     .lpi_to_pending = vgic_v3_lpi_to_pending,
+    .lpi_get_priority = vgic_v3_lpi_get_priority,
     /*
      * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
      * that can be supported is up to 4096(==256*16) in theory.
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index d704d7c..cd9a2a5 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -226,10 +226,15 @@  struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 
 static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
 {
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
+    struct vgic_irq_rank *rank;
     unsigned long flags;
     int priority;
 
+    /* LPIs don't have a rank, also store their priority separately. */
+    if ( is_lpi(virq) )
+        return v->domain->arch.vgic.handler->lpi_get_priority(v->domain, virq);
+
+    rank = vgic_rank_irq(v, virq);
     vgic_lock_rank(v, rank, flags);
     priority = rank->priority[virq & INTERRUPT_RANK_MASK];
     vgic_unlock_rank(v, rank, flags);
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 836a103..42963c0 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -366,6 +366,8 @@  struct gic_hw_operations {
     int (*map_hwdom_extra_mappings)(struct domain *d);
     /* Deny access to GIC regions */
     int (*iomem_deny_access)(const struct domain *d);
+    /* Handle LPIs, which require special handling */
+    void (*do_LPI)(unsigned int lpi);
 };
 
 void register_gic_ops(const struct gic_hw_operations *ops);
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 3b7c724..eb71fbd 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -139,6 +139,8 @@  void gicv3_its_dt_init(const struct dt_device_node *node);
 
 bool gicv3_its_host_has_its(void);
 
+void gicv3_do_LPI(unsigned int lpi);
+
 int gicv3_lpi_init_rdist(void __iomem * rdist_base);
 
 /* Initialize the host structures for LPIs and the host ITSes. */
@@ -182,6 +184,12 @@  static inline bool gicv3_its_host_has_its(void)
     return false;
 }
 
+static inline void gicv3_do_LPI(unsigned int lpi)
+{
+    /* We don't enable LPIs without an ITS. */
+    BUG();
+}
+
 static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
 {
     return -ENODEV;
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 7c86f5b..08d6294 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -66,12 +66,14 @@  struct pending_irq
 #define GIC_IRQ_GUEST_VISIBLE  2
 #define GIC_IRQ_GUEST_ENABLED  3
 #define GIC_IRQ_GUEST_MIGRATING   4
+#define GIC_IRQ_GUEST_LPI_PENDING 5     /* Caches the pending bit of an LPI. */
     unsigned long status;
     struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
     unsigned int irq;
 #define GIC_INVALID_LR         (uint8_t)~0
     uint8_t lr;
     uint8_t priority;
+    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
     /* inflight is used to append instances of pending_irq to
      * vgic.inflight_irqs */
     struct list_head inflight;
@@ -136,6 +138,7 @@  struct vgic_ops {
     bool (*emulate_reg)(struct cpu_user_regs *regs, union hsr hsr);
     /* lookup the struct pending_irq for a given LPI interrupt */
     struct pending_irq *(*lpi_to_pending)(struct domain *d, unsigned int vlpi);
+    int (*lpi_get_priority)(struct domain *d, uint32_t vlpi);
     /* Maximum number of vCPU supported */
     const unsigned int max_vcpus;
 };