Message ID | 20170511175340.8448-21-andre.przywara@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi Andre, On 11/05/17 18:53, Andre Przywara wrote: > When LPIs get unmapped by a guest, they might still be in some LR of > some VCPU. Nevertheless we remove the corresponding pending_irq > (possibly freeing it), and detect this case (irq_to_pending() returns > NULL) when the LR gets cleaned up later. > However a *new* LPI may get mapped with the same number while the old > LPI is *still* in some LR. To avoid getting the wrong state, we mark > every newly mapped LPI as PRISTINE, which means: has never been in an > LR before. If we detect the LPI in an LR anyway, it must have been an > older one, which we can simply retire. > Before inserting such a PRISTINE LPI into an LR, we must make sure that > it's not already in another LR, as the architecture forbids two > interrupts with the same virtual IRQ number on one CPU. > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > --- > xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- > xen/include/asm-arm/vgic.h | 6 +++++ > 2 files changed, 56 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > index fd3fa05..8bf0578 100644 > --- a/xen/arch/arm/gic.c > +++ b/xen/arch/arm/gic.c > @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, > { > ASSERT(!local_irq_is_enabled()); > > + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); > + > gic_hw_ops->update_lr(lr, p, state); > > set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) > #endif > } > > +/* > + * Find an unused LR to insert an IRQ into. If this new interrupt is a > + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. This replicate here a part of the commit message regarding the pending_irq structure. So we have background in the code of why going through the LRs. > + */ > +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) This should be unsigned for both the return and the lr variable. Also please explain the goal of the variable because it is not clear from the callers. > +{ > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); > + struct gic_lr lr_val; > + > + ASSERT(spin_is_locked(&v->arch.vgic.lock)); > + > + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > + { > + int used_lr = 0; find_next_bit return unsigned long. So likely you want to use either unsigned int or unsigned long here. > + > + while ( (used_lr = find_next_bit(lr_mask, nr_lrs, used_lr)) < nr_lrs ) > + { > + gic_hw_ops->read_lr(used_lr, &lr_val); > + if ( lr_val.virq == p->irq ) > + return used_lr; > + } > + } > + > + lr = find_next_zero_bit(lr_mask, nr_lrs, lr); > + > + return lr; > +} > + > void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > unsigned int priority) > { > - int i; > - unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > struct pending_irq *p = irq_to_pending(v, virtual_irq); > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; Why did you move around the variable? Please avoid such things, it is not like the ITS series is easy to review... > + int i = nr_lrs; I don't understand why you initialize i to nr_lrs. i will always have the value reset by the return of gic_find_unused_lr before been used. > > ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > @@ -457,7 +488,8 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > > if ( v == current && list_empty(&v->arch.vgic.lr_pending) ) > { > - i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs); > + i = gic_find_unused_lr(v, p, 0); > + > if (i < nr_lrs) { > set_bit(i, &this_cpu(lr_mask)); > gic_set_lr(i, p, GICH_LR_PENDING); > @@ -509,7 +541,17 @@ static void gic_update_one_lr(struct vcpu *v, int i) > } > else if ( lr_val.state & GICH_LR_PENDING ) > { > - int q __attribute__ ((unused)) = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > + int q __attribute__ ((unused)); > + > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > + { > + gic_hw_ops->clear_lr(i); > + clear_bit(i, &this_cpu(lr_mask)); > + > + return; > + } This code is very similar to what you do at the beginning when pending_irq is NULL. I would prefer if you stick the check there rather than try to address in all the different paths. I know it is not necessary to do it for the active path as LPI does not have active state, though this would have require an explanation in the commit message). But it is easier to maintain the test_and_clear_bit in a single place rather than spreading in 2 different one for little benefits. > + > + q = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > #ifdef GIC_DEBUG > if ( q ) > gdprintk(XENLOG_DEBUG, "trying to inject irq=%d into d%dv%d, when it is already pending in LR%d\n", > @@ -521,6 +563,9 @@ static void gic_update_one_lr(struct vcpu *v, int i) > gic_hw_ops->clear_lr(i); > clear_bit(i, &this_cpu(lr_mask)); > > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > + return; > + This would avoid this check too. > if ( p->desc != NULL ) > clear_bit(_IRQ_INPROGRESS, &p->desc->status); > clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > @@ -591,7 +636,7 @@ static void gic_restore_pending_irqs(struct vcpu *v) > inflight_r = &v->arch.vgic.inflight_irqs; > list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue ) > { > - lr = find_next_zero_bit(&this_cpu(lr_mask), nr_lrs, lr); > + lr = gic_find_unused_lr(v, p, lr); > if ( lr >= nr_lrs ) > { > /* No more free LRs: find a lower priority irq to evict */ > diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h > index 02732db..3fc4ceb 100644 > --- a/xen/include/asm-arm/vgic.h > +++ b/xen/include/asm-arm/vgic.h > @@ -60,12 +60,18 @@ struct pending_irq > * vcpu while it is still inflight and on an GICH_LR register on the > * old vcpu. > * > + * GIC_IRQ_GUEST_PRISTINE_LPI: the IRQ is a newly mapped LPI, which > + * has never been in an LR before. This means that any trace of an > + * LPI with the same number in an LR must be from an older LPI, which > + * has been unmapped before. > + * > */ > #define GIC_IRQ_GUEST_QUEUED 0 > #define GIC_IRQ_GUEST_ACTIVE 1 > #define GIC_IRQ_GUEST_VISIBLE 2 > #define GIC_IRQ_GUEST_ENABLED 3 > #define GIC_IRQ_GUEST_MIGRATING 4 > +#define GIC_IRQ_GUEST_PRISTINE_LPI 5 > unsigned long status; > struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */ > unsigned int irq; > Cheers,
On Thu, 11 May 2017, Andre Przywara wrote: > When LPIs get unmapped by a guest, they might still be in some LR of > some VCPU. Nevertheless we remove the corresponding pending_irq > (possibly freeing it), and detect this case (irq_to_pending() returns > NULL) when the LR gets cleaned up later. > However a *new* LPI may get mapped with the same number while the old > LPI is *still* in some LR. To avoid getting the wrong state, we mark > every newly mapped LPI as PRISTINE, which means: has never been in an > LR before. If we detect the LPI in an LR anyway, it must have been an > older one, which we can simply retire. > Before inserting such a PRISTINE LPI into an LR, we must make sure that > it's not already in another LR, as the architecture forbids two > interrupts with the same virtual IRQ number on one CPU. > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > --- > xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- > xen/include/asm-arm/vgic.h | 6 +++++ > 2 files changed, 56 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > index fd3fa05..8bf0578 100644 > --- a/xen/arch/arm/gic.c > +++ b/xen/arch/arm/gic.c > @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, > { > ASSERT(!local_irq_is_enabled()); > > + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); > + > gic_hw_ops->update_lr(lr, p, state); > > set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) > #endif > } > > +/* > + * Find an unused LR to insert an IRQ into. If this new interrupt is a > + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. > + */ > +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) > +{ > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); > + struct gic_lr lr_val; > + > + ASSERT(spin_is_locked(&v->arch.vgic.lock)); > + > + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) Maybe we should add an "unlikely". I can see how this would be OKish at runtime, but at boot time there might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, right? I have a suggestion, I'll leave it to you and Julien if you want to do this now, or maybe consider it as a TODO item. I am OK either way (I don't want to delay the ITS any longer). I am thinking we should do this scanning only after at least one MAPD has been issued for a given cpu at least once. I would resurrect the idea of a DISCARD flag, but not on the pending_irq, that I believe it's difficult to handle, but a single global DISCARD flag per struct vcpu. On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. Next time we want to inject a PRISTINE_IRQ on that cpu interface, we scan all LRs for interrupts with a NULL pending_irq. We remove those from LRs, then we remove the DISCARD flag. Do you think it would work? > + { > + int used_lr = 0; > + > + while ( (used_lr = find_next_bit(lr_mask, nr_lrs, used_lr)) < nr_lrs ) > + { > + gic_hw_ops->read_lr(used_lr, &lr_val); > + if ( lr_val.virq == p->irq ) > + return used_lr; > + } > + } > + > + lr = find_next_zero_bit(lr_mask, nr_lrs, lr); > + > + return lr; > +} > + > void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > unsigned int priority) > { > - int i; > - unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > struct pending_irq *p = irq_to_pending(v, virtual_irq); > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > + int i = nr_lrs; > > ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > @@ -457,7 +488,8 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > > if ( v == current && list_empty(&v->arch.vgic.lr_pending) ) > { > - i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs); > + i = gic_find_unused_lr(v, p, 0); > + > if (i < nr_lrs) { > set_bit(i, &this_cpu(lr_mask)); > gic_set_lr(i, p, GICH_LR_PENDING); > @@ -509,7 +541,17 @@ static void gic_update_one_lr(struct vcpu *v, int i) > } > else if ( lr_val.state & GICH_LR_PENDING ) > { > - int q __attribute__ ((unused)) = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > + int q __attribute__ ((unused)); > + > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > + { > + gic_hw_ops->clear_lr(i); > + clear_bit(i, &this_cpu(lr_mask)); > + > + return; > + } > + > + q = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > #ifdef GIC_DEBUG > if ( q ) > gdprintk(XENLOG_DEBUG, "trying to inject irq=%d into d%dv%d, when it is already pending in LR%d\n", > @@ -521,6 +563,9 @@ static void gic_update_one_lr(struct vcpu *v, int i) > gic_hw_ops->clear_lr(i); > clear_bit(i, &this_cpu(lr_mask)); > > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > + return; > if ( p->desc != NULL ) > clear_bit(_IRQ_INPROGRESS, &p->desc->status); > clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > @@ -591,7 +636,7 @@ static void gic_restore_pending_irqs(struct vcpu *v) > inflight_r = &v->arch.vgic.inflight_irqs; > list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue ) > { > - lr = find_next_zero_bit(&this_cpu(lr_mask), nr_lrs, lr); > + lr = gic_find_unused_lr(v, p, lr); > if ( lr >= nr_lrs ) > { > /* No more free LRs: find a lower priority irq to evict */ > diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h > index 02732db..3fc4ceb 100644 > --- a/xen/include/asm-arm/vgic.h > +++ b/xen/include/asm-arm/vgic.h > @@ -60,12 +60,18 @@ struct pending_irq > * vcpu while it is still inflight and on an GICH_LR register on the > * old vcpu. > * > + * GIC_IRQ_GUEST_PRISTINE_LPI: the IRQ is a newly mapped LPI, which > + * has never been in an LR before. This means that any trace of an > + * LPI with the same number in an LR must be from an older LPI, which > + * has been unmapped before. > + * > */ > #define GIC_IRQ_GUEST_QUEUED 0 > #define GIC_IRQ_GUEST_ACTIVE 1 > #define GIC_IRQ_GUEST_VISIBLE 2 > #define GIC_IRQ_GUEST_ENABLED 3 > #define GIC_IRQ_GUEST_MIGRATING 4 > +#define GIC_IRQ_GUEST_PRISTINE_LPI 5 > unsigned long status; > struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */ > unsigned int irq; > -- > 2.9.0 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > https://lists.xen.org/xen-devel >
On Fri, 19 May 2017, Stefano Stabellini wrote: > On Thu, 11 May 2017, Andre Przywara wrote: > > When LPIs get unmapped by a guest, they might still be in some LR of > > some VCPU. Nevertheless we remove the corresponding pending_irq > > (possibly freeing it), and detect this case (irq_to_pending() returns > > NULL) when the LR gets cleaned up later. > > However a *new* LPI may get mapped with the same number while the old > > LPI is *still* in some LR. To avoid getting the wrong state, we mark > > every newly mapped LPI as PRISTINE, which means: has never been in an > > LR before. If we detect the LPI in an LR anyway, it must have been an > > older one, which we can simply retire. > > Before inserting such a PRISTINE LPI into an LR, we must make sure that > > it's not already in another LR, as the architecture forbids two > > interrupts with the same virtual IRQ number on one CPU. > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > --- > > xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- > > xen/include/asm-arm/vgic.h | 6 +++++ > > 2 files changed, 56 insertions(+), 5 deletions(-) > > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > > index fd3fa05..8bf0578 100644 > > --- a/xen/arch/arm/gic.c > > +++ b/xen/arch/arm/gic.c > > @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, > > { > > ASSERT(!local_irq_is_enabled()); > > > > + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); > > + > > gic_hw_ops->update_lr(lr, p, state); > > > > set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > > @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) > > #endif > > } > > > > +/* > > + * Find an unused LR to insert an IRQ into. If this new interrupt is a > > + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. > > + */ > > +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) > > +{ > > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > > + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); > > + struct gic_lr lr_val; > > + > > + ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > + > > + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > Maybe we should add an "unlikely". > > I can see how this would be OKish at runtime, but at boot time there > might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, > right? > > I have a suggestion, I'll leave it to you and Julien if you want to do > this now, or maybe consider it as a TODO item. I am OK either way (I > don't want to delay the ITS any longer). > > I am thinking we should do this scanning only after at least one MAPD > has been issued for a given cpu at least once. I would resurrect the > idea of a DISCARD flag, but not on the pending_irq, that I believe it's > difficult to handle, but a single global DISCARD flag per struct vcpu. > > On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. > Next time we want to inject a PRISTINE_IRQ on that cpu interface, we > scan all LRs for interrupts with a NULL pending_irq. We remove those > from LRs, then we remove the DISCARD flag. > > Do you think it would work? This would need to be done not only on MAPD but also on DISCARD (and on any other commands that would cause a pending_irq - vLPI mapping to be dropped). > > + { > > + int used_lr = 0; > > + > > + while ( (used_lr = find_next_bit(lr_mask, nr_lrs, used_lr)) < nr_lrs ) > > + { > > + gic_hw_ops->read_lr(used_lr, &lr_val); > > + if ( lr_val.virq == p->irq ) > > + return used_lr; > > + } > > + } > > + > > + lr = find_next_zero_bit(lr_mask, nr_lrs, lr); > > + > > + return lr; > > +} > > + > > void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > > unsigned int priority) > > { > > - int i; > > - unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > > struct pending_irq *p = irq_to_pending(v, virtual_irq); > > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > > + int i = nr_lrs; > > > > ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > > > @@ -457,7 +488,8 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, > > > > if ( v == current && list_empty(&v->arch.vgic.lr_pending) ) > > { > > - i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs); > > + i = gic_find_unused_lr(v, p, 0); > > + > > if (i < nr_lrs) { > > set_bit(i, &this_cpu(lr_mask)); > > gic_set_lr(i, p, GICH_LR_PENDING); > > @@ -509,7 +541,17 @@ static void gic_update_one_lr(struct vcpu *v, int i) > > } > > else if ( lr_val.state & GICH_LR_PENDING ) > > { > > - int q __attribute__ ((unused)) = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > > + int q __attribute__ ((unused)); > > + > > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > + { > > + gic_hw_ops->clear_lr(i); > > + clear_bit(i, &this_cpu(lr_mask)); > > + > > + return; > > + } > > + > > + q = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); > > #ifdef GIC_DEBUG > > if ( q ) > > gdprintk(XENLOG_DEBUG, "trying to inject irq=%d into d%dv%d, when it is already pending in LR%d\n", > > @@ -521,6 +563,9 @@ static void gic_update_one_lr(struct vcpu *v, int i) > > gic_hw_ops->clear_lr(i); > > clear_bit(i, &this_cpu(lr_mask)); > > > > + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > + return; > > if ( p->desc != NULL ) > > clear_bit(_IRQ_INPROGRESS, &p->desc->status); > > clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > > @@ -591,7 +636,7 @@ static void gic_restore_pending_irqs(struct vcpu *v) > > inflight_r = &v->arch.vgic.inflight_irqs; > > list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue ) > > { > > - lr = find_next_zero_bit(&this_cpu(lr_mask), nr_lrs, lr); > > + lr = gic_find_unused_lr(v, p, lr); > > if ( lr >= nr_lrs ) > > { > > /* No more free LRs: find a lower priority irq to evict */ > > diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h > > index 02732db..3fc4ceb 100644 > > --- a/xen/include/asm-arm/vgic.h > > +++ b/xen/include/asm-arm/vgic.h > > @@ -60,12 +60,18 @@ struct pending_irq > > * vcpu while it is still inflight and on an GICH_LR register on the > > * old vcpu. > > * > > + * GIC_IRQ_GUEST_PRISTINE_LPI: the IRQ is a newly mapped LPI, which > > + * has never been in an LR before. This means that any trace of an > > + * LPI with the same number in an LR must be from an older LPI, which > > + * has been unmapped before. > > + * > > */ > > #define GIC_IRQ_GUEST_QUEUED 0 > > #define GIC_IRQ_GUEST_ACTIVE 1 > > #define GIC_IRQ_GUEST_VISIBLE 2 > > #define GIC_IRQ_GUEST_ENABLED 3 > > #define GIC_IRQ_GUEST_MIGRATING 4 > > +#define GIC_IRQ_GUEST_PRISTINE_LPI 5 > > unsigned long status; > > struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */ > > unsigned int irq; > > -- > > 2.9.0 > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > https://lists.xen.org/xen-devel > > >
Hi Stefano, On 23/05/17 00:48, Stefano Stabellini wrote: > On Fri, 19 May 2017, Stefano Stabellini wrote: >> On Thu, 11 May 2017, Andre Przywara wrote: >>> When LPIs get unmapped by a guest, they might still be in some LR of >>> some VCPU. Nevertheless we remove the corresponding pending_irq >>> (possibly freeing it), and detect this case (irq_to_pending() returns >>> NULL) when the LR gets cleaned up later. >>> However a *new* LPI may get mapped with the same number while the old >>> LPI is *still* in some LR. To avoid getting the wrong state, we mark >>> every newly mapped LPI as PRISTINE, which means: has never been in an >>> LR before. If we detect the LPI in an LR anyway, it must have been an >>> older one, which we can simply retire. >>> Before inserting such a PRISTINE LPI into an LR, we must make sure that >>> it's not already in another LR, as the architecture forbids two >>> interrupts with the same virtual IRQ number on one CPU. >>> >>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >>> --- >>> xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- >>> xen/include/asm-arm/vgic.h | 6 +++++ >>> 2 files changed, 56 insertions(+), 5 deletions(-) >>> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >>> index fd3fa05..8bf0578 100644 >>> --- a/xen/arch/arm/gic.c >>> +++ b/xen/arch/arm/gic.c >>> @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, >>> { >>> ASSERT(!local_irq_is_enabled()); >>> >>> + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); >>> + >>> gic_hw_ops->update_lr(lr, p, state); >>> >>> set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); >>> @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) >>> #endif >>> } >>> >>> +/* >>> + * Find an unused LR to insert an IRQ into. If this new interrupt is a >>> + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. >>> + */ >>> +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) >>> +{ >>> + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; >>> + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); >>> + struct gic_lr lr_val; >>> + >>> + ASSERT(spin_is_locked(&v->arch.vgic.lock)); >>> + >>> + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) >> >> Maybe we should add an "unlikely". >> >> I can see how this would be OKish at runtime, but at boot time there >> might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, >> right? You cannot have any PRISTINE_LPIs without any MAPDs done. This bit will be set when you do the first MAPTI. >> >> I have a suggestion, I'll leave it to you and Julien if you want to do >> this now, or maybe consider it as a TODO item. I am OK either way (I >> don't want to delay the ITS any longer). >> >> I am thinking we should do this scanning only after at least one MAPD >> has been issued for a given cpu at least once. I would resurrect the >> idea of a DISCARD flag, but not on the pending_irq, that I believe it's >> difficult to handle, but a single global DISCARD flag per struct vcpu. >> >> On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. >> Next time we want to inject a PRISTINE_IRQ on that cpu interface, we >> scan all LRs for interrupts with a NULL pending_irq. We remove those >> from LRs, then we remove the DISCARD flag. >> >> Do you think it would work? I don't understand the point to do that. Ok, you will get the first PRISTINE_LPI "fast" (though likely LRs will be empty), but all the other will be "slow" (though likely LRs will be empty too). The pain to implement your suggestion does not seem to be worth it so far. Cheers,
Hi Stefano, On 20/05/17 02:25, Stefano Stabellini wrote: > On Thu, 11 May 2017, Andre Przywara wrote: >> When LPIs get unmapped by a guest, they might still be in some LR of >> some VCPU. Nevertheless we remove the corresponding pending_irq >> (possibly freeing it), and detect this case (irq_to_pending() returns >> NULL) when the LR gets cleaned up later. >> However a *new* LPI may get mapped with the same number while the old >> LPI is *still* in some LR. To avoid getting the wrong state, we mark >> every newly mapped LPI as PRISTINE, which means: has never been in an >> LR before. If we detect the LPI in an LR anyway, it must have been an >> older one, which we can simply retire. >> Before inserting such a PRISTINE LPI into an LR, we must make sure that >> it's not already in another LR, as the architecture forbids two >> interrupts with the same virtual IRQ number on one CPU. >> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >> --- >> xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- >> xen/include/asm-arm/vgic.h | 6 +++++ >> 2 files changed, 56 insertions(+), 5 deletions(-) >> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >> index fd3fa05..8bf0578 100644 >> --- a/xen/arch/arm/gic.c >> +++ b/xen/arch/arm/gic.c >> @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, >> { >> ASSERT(!local_irq_is_enabled()); >> >> + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); >> + >> gic_hw_ops->update_lr(lr, p, state); >> >> set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); >> @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) >> #endif >> } >> >> +/* >> + * Find an unused LR to insert an IRQ into. If this new interrupt is a >> + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. >> + */ >> +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) >> +{ >> + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; >> + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); >> + struct gic_lr lr_val; >> + >> + ASSERT(spin_is_locked(&v->arch.vgic.lock)); >> + >> + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > Maybe we should add an "unlikely". > > I can see how this would be OKish at runtime, but at boot time there > might be a bunch of PRISTINE_LPIs, What is your concern here, performance? Let's put this into perspective: - The PRISTINE bit gets set upon MAPTI, which Linux usually does *once* when the driver gets loaded. It gets cleared after the first injection. - If that happens, we scan all LRs. Most implementations have 4(!) of them (ARM GIC implementations, for instance), also the algorithm only scans *used* LRs, so normally just one or two. - Reading the LR is a *local* system register *read*, not an MMIO access, and not propagated to other cores. Yes, this may be "costly" (compared to other instructions), but it's probably still cheaper than a page table walk (TLB miss) or L2 cache miss. So to summarize: this is rare, iterates over only a very small number of registers and is not hugely expensive. At this point in time I would refrain from any kind of performance optimization, at least unless we have solved all the other issues and have done some benchmarking/profiling (on different hardware platforms). > but no MAPDs have been issued yet, right? As Julien already mentioned, this gets set after a MAPTI, which requires a MAPD before. Cheers, Andre. > I have a suggestion, I'll leave it to you and Julien if you want to do > this now, or maybe consider it as a TODO item. I am OK either way (I > don't want to delay the ITS any longer). > > I am thinking we should do this scanning only after at least one MAPD > has been issued for a given cpu at least once. I would resurrect the > idea of a DISCARD flag, but not on the pending_irq, that I believe it's > difficult to handle, but a single global DISCARD flag per struct vcpu. > > On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. > Next time we want to inject a PRISTINE_IRQ on that cpu interface, we > scan all LRs for interrupts with a NULL pending_irq. We remove those > from LRs, then we remove the DISCARD flag. > > Do you think it would work? > > >> + { >> + int used_lr = 0; >> + >> + while ( (used_lr = find_next_bit(lr_mask, nr_lrs, used_lr)) < nr_lrs ) >> + { >> + gic_hw_ops->read_lr(used_lr, &lr_val); >> + if ( lr_val.virq == p->irq ) >> + return used_lr; >> + } >> + } >> + >> + lr = find_next_zero_bit(lr_mask, nr_lrs, lr); >> + >> + return lr; >> +} >> + >> void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, >> unsigned int priority) >> { >> - int i; >> - unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; >> struct pending_irq *p = irq_to_pending(v, virtual_irq); >> + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; >> + int i = nr_lrs; >> >> ASSERT(spin_is_locked(&v->arch.vgic.lock)); >> >> @@ -457,7 +488,8 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, >> >> if ( v == current && list_empty(&v->arch.vgic.lr_pending) ) >> { >> - i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs); >> + i = gic_find_unused_lr(v, p, 0); >> + >> if (i < nr_lrs) { >> set_bit(i, &this_cpu(lr_mask)); >> gic_set_lr(i, p, GICH_LR_PENDING); >> @@ -509,7 +541,17 @@ static void gic_update_one_lr(struct vcpu *v, int i) >> } >> else if ( lr_val.state & GICH_LR_PENDING ) >> { >> - int q __attribute__ ((unused)) = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); >> + int q __attribute__ ((unused)); >> + >> + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) >> + { >> + gic_hw_ops->clear_lr(i); >> + clear_bit(i, &this_cpu(lr_mask)); >> + >> + return; >> + } >> + >> + q = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); >> #ifdef GIC_DEBUG >> if ( q ) >> gdprintk(XENLOG_DEBUG, "trying to inject irq=%d into d%dv%d, when it is already pending in LR%d\n", >> @@ -521,6 +563,9 @@ static void gic_update_one_lr(struct vcpu *v, int i) >> gic_hw_ops->clear_lr(i); >> clear_bit(i, &this_cpu(lr_mask)); >> >> + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) >> + return; >> if ( p->desc != NULL ) >> clear_bit(_IRQ_INPROGRESS, &p->desc->status); >> clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); >> @@ -591,7 +636,7 @@ static void gic_restore_pending_irqs(struct vcpu *v) >> inflight_r = &v->arch.vgic.inflight_irqs; >> list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue ) >> { >> - lr = find_next_zero_bit(&this_cpu(lr_mask), nr_lrs, lr); >> + lr = gic_find_unused_lr(v, p, lr); >> if ( lr >= nr_lrs ) >> { >> /* No more free LRs: find a lower priority irq to evict */ >> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h >> index 02732db..3fc4ceb 100644 >> --- a/xen/include/asm-arm/vgic.h >> +++ b/xen/include/asm-arm/vgic.h >> @@ -60,12 +60,18 @@ struct pending_irq >> * vcpu while it is still inflight and on an GICH_LR register on the >> * old vcpu. >> * >> + * GIC_IRQ_GUEST_PRISTINE_LPI: the IRQ is a newly mapped LPI, which >> + * has never been in an LR before. This means that any trace of an >> + * LPI with the same number in an LR must be from an older LPI, which >> + * has been unmapped before. >> + * >> */ >> #define GIC_IRQ_GUEST_QUEUED 0 >> #define GIC_IRQ_GUEST_ACTIVE 1 >> #define GIC_IRQ_GUEST_VISIBLE 2 >> #define GIC_IRQ_GUEST_ENABLED 3 >> #define GIC_IRQ_GUEST_MIGRATING 4 >> +#define GIC_IRQ_GUEST_PRISTINE_LPI 5 >> unsigned long status; >> struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */ >> unsigned int irq; >> -- >> 2.9.0 >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> https://lists.xen.org/xen-devel >>
On Tue, 23 May 2017, Julien Grall wrote: > Hi Stefano, > > On 23/05/17 00:48, Stefano Stabellini wrote: > > On Fri, 19 May 2017, Stefano Stabellini wrote: > > > On Thu, 11 May 2017, Andre Przywara wrote: > > > > When LPIs get unmapped by a guest, they might still be in some LR of > > > > some VCPU. Nevertheless we remove the corresponding pending_irq > > > > (possibly freeing it), and detect this case (irq_to_pending() returns > > > > NULL) when the LR gets cleaned up later. > > > > However a *new* LPI may get mapped with the same number while the old > > > > LPI is *still* in some LR. To avoid getting the wrong state, we mark > > > > every newly mapped LPI as PRISTINE, which means: has never been in an > > > > LR before. If we detect the LPI in an LR anyway, it must have been an > > > > older one, which we can simply retire. > > > > Before inserting such a PRISTINE LPI into an LR, we must make sure that > > > > it's not already in another LR, as the architecture forbids two > > > > interrupts with the same virtual IRQ number on one CPU. > > > > > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > > > --- > > > > xen/arch/arm/gic.c | 55 > > > > +++++++++++++++++++++++++++++++++++++++++----- > > > > xen/include/asm-arm/vgic.h | 6 +++++ > > > > 2 files changed, 56 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > > > > index fd3fa05..8bf0578 100644 > > > > --- a/xen/arch/arm/gic.c > > > > +++ b/xen/arch/arm/gic.c > > > > @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct > > > > pending_irq *p, > > > > { > > > > ASSERT(!local_irq_is_enabled()); > > > > > > > > + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); > > > > + > > > > gic_hw_ops->update_lr(lr, p, state); > > > > > > > > set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > > > > @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, > > > > unsigned int virtual_irq) > > > > #endif > > > > } > > > > > > > > +/* > > > > + * Find an unused LR to insert an IRQ into. If this new interrupt is a > > > > + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ > > > > twice. > > > > + */ > > > > +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, > > > > int lr) > > > > +{ > > > > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > > > > + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); > > > > + struct gic_lr lr_val; > > > > + > > > > + ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > > > + > > > > + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > > > > > Maybe we should add an "unlikely". > > > > > > I can see how this would be OKish at runtime, but at boot time there > > > might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, > > > right? > > You cannot have any PRISTINE_LPIs without any MAPDs done. This bit will be set > when you do the first MAPTI. > > > > > > > I have a suggestion, I'll leave it to you and Julien if you want to do > > > this now, or maybe consider it as a TODO item. I am OK either way (I > > > don't want to delay the ITS any longer). > > > > > > I am thinking we should do this scanning only after at least one MAPD > > > has been issued for a given cpu at least once. I would resurrect the > > > idea of a DISCARD flag, but not on the pending_irq, that I believe it's > > > difficult to handle, but a single global DISCARD flag per struct vcpu. > > > > > > On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. > > > Next time we want to inject a PRISTINE_IRQ on that cpu interface, we > > > scan all LRs for interrupts with a NULL pending_irq. We remove those > > > from LRs, then we remove the DISCARD flag. > > > > > > Do you think it would work? > > I don't understand the point to do that. Ok, you will get the first > PRISTINE_LPI "fast" (though likely LRs will be empty), but all the other will > be "slow" (though likely LRs will be empty too). > > The pain to implement your suggestion does not seem to be worth it so far. Let me explain it a bit better, I think I didn't clarify it well enough. Let me also premise, that this would be fine to do later, it doesn't have to be part of this patch. When I wrote MAPD above, I meant actually any commands that delete an existing pending_irq - vLPI mapping. Specifically, DISCARD, and MAPD when the if ( !valid ) /* Discard all events and remove pending LPIs. */ its_unmap_device(its, devid); code path is taken, which should not be the case at boot time, right? Are there any other commands that remove a pending_irq - vLPI mapping that I missed? The idea is that we could add a VGIC_V3_LPIS_DISCARD flag to arch_vcpu. VGIC_V3_LPIS_DISCARD is set on a DISCARD command, and on a MAPD (!valid) command. If VGIC_V3_LPIS_DISCARD is not set, there is no need to scan anything. If VGIC_V3_LPIS_DISCARD is set *and* we want to inject a PRISTINE_IRQ, then we do the scanning. When we do the scanning, we check all LRs for NULL pending_irq structs. We clear them all in one go. Then we remove VGIC_V3_LPIS_DISCARD. This way, we get all PRISTINE_LPI fast, except for the very first one after a DISCARD or MAPD (!valid) command. Does it make more sense now? What do you think?
Hi Stefano, On 05/23/2017 07:23 PM, Stefano Stabellini wrote: > On Tue, 23 May 2017, Julien Grall wrote: >> Hi Stefano, >> >> On 23/05/17 00:48, Stefano Stabellini wrote: >>> On Fri, 19 May 2017, Stefano Stabellini wrote: >>>> On Thu, 11 May 2017, Andre Przywara wrote: >>>>> When LPIs get unmapped by a guest, they might still be in some LR of >>>>> some VCPU. Nevertheless we remove the corresponding pending_irq >>>>> (possibly freeing it), and detect this case (irq_to_pending() returns >>>>> NULL) when the LR gets cleaned up later. >>>>> However a *new* LPI may get mapped with the same number while the old >>>>> LPI is *still* in some LR. To avoid getting the wrong state, we mark >>>>> every newly mapped LPI as PRISTINE, which means: has never been in an >>>>> LR before. If we detect the LPI in an LR anyway, it must have been an >>>>> older one, which we can simply retire. >>>>> Before inserting such a PRISTINE LPI into an LR, we must make sure that >>>>> it's not already in another LR, as the architecture forbids two >>>>> interrupts with the same virtual IRQ number on one CPU. >>>>> >>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >>>>> --- >>>>> xen/arch/arm/gic.c | 55 >>>>> +++++++++++++++++++++++++++++++++++++++++----- >>>>> xen/include/asm-arm/vgic.h | 6 +++++ >>>>> 2 files changed, 56 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >>>>> index fd3fa05..8bf0578 100644 >>>>> --- a/xen/arch/arm/gic.c >>>>> +++ b/xen/arch/arm/gic.c >>>>> @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct >>>>> pending_irq *p, >>>>> { >>>>> ASSERT(!local_irq_is_enabled()); >>>>> >>>>> + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); >>>>> + >>>>> gic_hw_ops->update_lr(lr, p, state); >>>>> >>>>> set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); >>>>> @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, >>>>> unsigned int virtual_irq) >>>>> #endif >>>>> } >>>>> >>>>> +/* >>>>> + * Find an unused LR to insert an IRQ into. If this new interrupt is a >>>>> + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ >>>>> twice. >>>>> + */ >>>>> +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, >>>>> int lr) >>>>> +{ >>>>> + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; >>>>> + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); >>>>> + struct gic_lr lr_val; >>>>> + >>>>> + ASSERT(spin_is_locked(&v->arch.vgic.lock)); >>>>> + >>>>> + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) >>>> >>>> Maybe we should add an "unlikely". >>>> >>>> I can see how this would be OKish at runtime, but at boot time there >>>> might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, >>>> right? >> >> You cannot have any PRISTINE_LPIs without any MAPDs done. This bit will be set >> when you do the first MAPTI. >> >>>> >>>> I have a suggestion, I'll leave it to you and Julien if you want to do >>>> this now, or maybe consider it as a TODO item. I am OK either way (I >>>> don't want to delay the ITS any longer). >>>> >>>> I am thinking we should do this scanning only after at least one MAPD >>>> has been issued for a given cpu at least once. I would resurrect the >>>> idea of a DISCARD flag, but not on the pending_irq, that I believe it's >>>> difficult to handle, but a single global DISCARD flag per struct vcpu. >>>> >>>> On MAPD, we set DISCARD for the target vcpu of the LPI we are dropping. >>>> Next time we want to inject a PRISTINE_IRQ on that cpu interface, we >>>> scan all LRs for interrupts with a NULL pending_irq. We remove those >>>> from LRs, then we remove the DISCARD flag. >>>> >>>> Do you think it would work? >> >> I don't understand the point to do that. Ok, you will get the first >> PRISTINE_LPI "fast" (though likely LRs will be empty), but all the other will >> be "slow" (though likely LRs will be empty too). >> >> The pain to implement your suggestion does not seem to be worth it so far. > > Let me explain it a bit better, I think I didn't clarify it well enough. > Let me also premise, that this would be fine to do later, it doesn't > have to be part of this patch. > > When I wrote MAPD above, I meant actually any commands that delete an > existing pending_irq - vLPI mapping. Specifically, DISCARD, and MAPD > when the > > if ( !valid ) > /* Discard all events and remove pending LPIs. */ > its_unmap_device(its, devid); > > code path is taken, which should not be the case at boot time, right? > Are there any other commands that remove a pending_irq - vLPI mapping > that I missed? I don't think so. > > The idea is that we could add a VGIC_V3_LPIS_DISCARD flag to arch_vcpu. > VGIC_V3_LPIS_DISCARD is set on a DISCARD command, and on a MAPD (!valid) > command. If VGIC_V3_LPIS_DISCARD is not set, there is no need to scan > anything. If VGIC_V3_LPIS_DISCARD is set *and* we want to inject a > PRISTINE_IRQ, then we do the scanning. > > When we do the scanning, we check all LRs for NULL pending_irq structs. > We clear them all in one go. Then we remove VGIC_V3_LPIS_DISCARD. The problem we are trying to solve here is not because of NULL pending_irq structs. It is because of the previous interrupt may still be in LRs so we would end-up to have twice the LPIs in it. This will result to unpredictable behavior. This could happen if you do: vCPU A | vCPU B | DISCARD vLPI1 | MAPTI vLPI1 | interrupt injected on vCPU B | entering in the hyp | clear_lrs | vgic_vcpu_inject_irq clear_lrs will not remove the interrupt from LRs if it was already pending because pending_irq is not NULL anymore. This is causing issue because we are trying to be clever with the LRs by not regenerating them at every entry/exit. This is causing trouble in many more place in the vGIC. IHMO we should attempt to regenerate them and see how this will affect the performance. > > This way, we get all PRISTINE_LPI fast, except for the very first one > after a DISCARD or MAPD (!valid) command. > > Does it make more sense now? What do you think? To be honest, I think we are trying to think about premature optimization without any number. We should first look at the vGIC rework and then see if this code will stay in place (I have the feeling it will disappear). Cheers,
On Wed, 24 May 2017, Julien Grall wrote: > Hi Stefano, > > On 05/23/2017 07:23 PM, Stefano Stabellini wrote: > > On Tue, 23 May 2017, Julien Grall wrote: > > > Hi Stefano, > > > > > > On 23/05/17 00:48, Stefano Stabellini wrote: > > > > On Fri, 19 May 2017, Stefano Stabellini wrote: > > > > > On Thu, 11 May 2017, Andre Przywara wrote: > > > > > > When LPIs get unmapped by a guest, they might still be in some LR of > > > > > > some VCPU. Nevertheless we remove the corresponding pending_irq > > > > > > (possibly freeing it), and detect this case (irq_to_pending() > > > > > > returns > > > > > > NULL) when the LR gets cleaned up later. > > > > > > However a *new* LPI may get mapped with the same number while the > > > > > > old > > > > > > LPI is *still* in some LR. To avoid getting the wrong state, we mark > > > > > > every newly mapped LPI as PRISTINE, which means: has never been in > > > > > > an > > > > > > LR before. If we detect the LPI in an LR anyway, it must have been > > > > > > an > > > > > > older one, which we can simply retire. > > > > > > Before inserting such a PRISTINE LPI into an LR, we must make sure > > > > > > that > > > > > > it's not already in another LR, as the architecture forbids two > > > > > > interrupts with the same virtual IRQ number on one CPU. > > > > > > > > > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > > > > > --- > > > > > > xen/arch/arm/gic.c | 55 > > > > > > +++++++++++++++++++++++++++++++++++++++++----- > > > > > > xen/include/asm-arm/vgic.h | 6 +++++ > > > > > > 2 files changed, 56 insertions(+), 5 deletions(-) > > > > > > > > > > > > diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c > > > > > > index fd3fa05..8bf0578 100644 > > > > > > --- a/xen/arch/arm/gic.c > > > > > > +++ b/xen/arch/arm/gic.c > > > > > > @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct > > > > > > pending_irq *p, > > > > > > { > > > > > > ASSERT(!local_irq_is_enabled()); > > > > > > > > > > > > + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); > > > > > > + > > > > > > gic_hw_ops->update_lr(lr, p, state); > > > > > > > > > > > > set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); > > > > > > @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, > > > > > > unsigned int virtual_irq) > > > > > > #endif > > > > > > } > > > > > > > > > > > > +/* > > > > > > + * Find an unused LR to insert an IRQ into. If this new interrupt > > > > > > is a > > > > > > + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ > > > > > > twice. > > > > > > + */ > > > > > > +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq > > > > > > *p, > > > > > > int lr) > > > > > > +{ > > > > > > + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; > > > > > > + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); > > > > > > + struct gic_lr lr_val; > > > > > > + > > > > > > + ASSERT(spin_is_locked(&v->arch.vgic.lock)); > > > > > > + > > > > > > + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) > > > > > > > > > > Maybe we should add an "unlikely". > > > > > > > > > > I can see how this would be OKish at runtime, but at boot time there > > > > > might be a bunch of PRISTINE_LPIs, but no MAPDs have been issued yet, > > > > > right? > > > > > > You cannot have any PRISTINE_LPIs without any MAPDs done. This bit will be > > > set > > > when you do the first MAPTI. > > > > > > > > > > > > > I have a suggestion, I'll leave it to you and Julien if you want to do > > > > > this now, or maybe consider it as a TODO item. I am OK either way (I > > > > > don't want to delay the ITS any longer). > > > > > > > > > > I am thinking we should do this scanning only after at least one MAPD > > > > > has been issued for a given cpu at least once. I would resurrect the > > > > > idea of a DISCARD flag, but not on the pending_irq, that I believe > > > > > it's > > > > > difficult to handle, but a single global DISCARD flag per struct vcpu. > > > > > > > > > > On MAPD, we set DISCARD for the target vcpu of the LPI we are > > > > > dropping. > > > > > Next time we want to inject a PRISTINE_IRQ on that cpu interface, we > > > > > scan all LRs for interrupts with a NULL pending_irq. We remove those > > > > > from LRs, then we remove the DISCARD flag. > > > > > > > > > > Do you think it would work? > > > > > > I don't understand the point to do that. Ok, you will get the first > > > PRISTINE_LPI "fast" (though likely LRs will be empty), but all the other > > > will > > > be "slow" (though likely LRs will be empty too). > > > > > > The pain to implement your suggestion does not seem to be worth it so far. > > > > Let me explain it a bit better, I think I didn't clarify it well enough. > > Let me also premise, that this would be fine to do later, it doesn't > > have to be part of this patch. > > > > When I wrote MAPD above, I meant actually any commands that delete an > > existing pending_irq - vLPI mapping. Specifically, DISCARD, and MAPD > > when the > > > > if ( !valid ) > > /* Discard all events and remove pending LPIs. */ > > its_unmap_device(its, devid); > > > > code path is taken, which should not be the case at boot time, right? > > Are there any other commands that remove a pending_irq - vLPI mapping > > that I missed? > > I don't think so. > > > > > The idea is that we could add a VGIC_V3_LPIS_DISCARD flag to arch_vcpu. > > VGIC_V3_LPIS_DISCARD is set on a DISCARD command, and on a MAPD (!valid) > > command. If VGIC_V3_LPIS_DISCARD is not set, there is no need to scan > > anything. If VGIC_V3_LPIS_DISCARD is set *and* we want to inject a > > PRISTINE_IRQ, then we do the scanning. > > > > When we do the scanning, we check all LRs for NULL pending_irq structs. > > We clear them all in one go. Then we remove VGIC_V3_LPIS_DISCARD. > > The problem we are trying to solve here is not because of NULL pending_irq > structs. It is because of the previous interrupt may still be in LRs so we > would end-up to have twice the LPIs in it. This will result to unpredictable > behavior. > > This could happen if you do: > > vCPU A | vCPU B > | > DISCARD vLPI1 | > MAPTI vLPI1 | > > interrupt injected on vCPU B > > | entering in the hyp > | clear_lrs > | vgic_vcpu_inject_irq > > > clear_lrs will not remove the interrupt from LRs if it was already pending > because pending_irq is not NULL anymore. > > This is causing issue because we are trying to be clever with the LRs by not > regenerating them at every entry/exit. This is causing trouble in many more > place in the vGIC. IHMO we should attempt to regenerate them and see how this > will affect the performance. Yes but if pending_irq is not NULL, then it will be marked as PRISTINE, so it is still recognizable. We can still "clean it up". > > This way, we get all PRISTINE_LPI fast, except for the very first one > > after a DISCARD or MAPD (!valid) command. > > > > Does it make more sense now? What do you think? > > To be honest, I think we are trying to think about premature optimization > without any number. We should first look at the vGIC rework and then see if > this code will stay in place (I have the feeling it will disappear). OK, no problem. Let's revisit this in the future.
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c index fd3fa05..8bf0578 100644 --- a/xen/arch/arm/gic.c +++ b/xen/arch/arm/gic.c @@ -375,6 +375,8 @@ static inline void gic_set_lr(int lr, struct pending_irq *p, { ASSERT(!local_irq_is_enabled()); + clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status); + gic_hw_ops->update_lr(lr, p, state); set_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); @@ -442,12 +444,41 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq) #endif } +/* + * Find an unused LR to insert an IRQ into. If this new interrupt is a + * PRISTINE LPI, scan the other LRs to avoid inserting the same IRQ twice. + */ +static int gic_find_unused_lr(struct vcpu *v, struct pending_irq *p, int lr) +{ + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; + unsigned long *lr_mask = (unsigned long *) &this_cpu(lr_mask); + struct gic_lr lr_val; + + ASSERT(spin_is_locked(&v->arch.vgic.lock)); + + if ( test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) + { + int used_lr = 0; + + while ( (used_lr = find_next_bit(lr_mask, nr_lrs, used_lr)) < nr_lrs ) + { + gic_hw_ops->read_lr(used_lr, &lr_val); + if ( lr_val.virq == p->irq ) + return used_lr; + } + } + + lr = find_next_zero_bit(lr_mask, nr_lrs, lr); + + return lr; +} + void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, unsigned int priority) { - int i; - unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; struct pending_irq *p = irq_to_pending(v, virtual_irq); + unsigned int nr_lrs = gic_hw_ops->info->nr_lrs; + int i = nr_lrs; ASSERT(spin_is_locked(&v->arch.vgic.lock)); @@ -457,7 +488,8 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int virtual_irq, if ( v == current && list_empty(&v->arch.vgic.lr_pending) ) { - i = find_first_zero_bit(&this_cpu(lr_mask), nr_lrs); + i = gic_find_unused_lr(v, p, 0); + if (i < nr_lrs) { set_bit(i, &this_cpu(lr_mask)); gic_set_lr(i, p, GICH_LR_PENDING); @@ -509,7 +541,17 @@ static void gic_update_one_lr(struct vcpu *v, int i) } else if ( lr_val.state & GICH_LR_PENDING ) { - int q __attribute__ ((unused)) = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); + int q __attribute__ ((unused)); + + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) + { + gic_hw_ops->clear_lr(i); + clear_bit(i, &this_cpu(lr_mask)); + + return; + } + + q = test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status); #ifdef GIC_DEBUG if ( q ) gdprintk(XENLOG_DEBUG, "trying to inject irq=%d into d%dv%d, when it is already pending in LR%d\n", @@ -521,6 +563,9 @@ static void gic_update_one_lr(struct vcpu *v, int i) gic_hw_ops->clear_lr(i); clear_bit(i, &this_cpu(lr_mask)); + if ( test_and_clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, &p->status) ) + return; + if ( p->desc != NULL ) clear_bit(_IRQ_INPROGRESS, &p->desc->status); clear_bit(GIC_IRQ_GUEST_VISIBLE, &p->status); @@ -591,7 +636,7 @@ static void gic_restore_pending_irqs(struct vcpu *v) inflight_r = &v->arch.vgic.inflight_irqs; list_for_each_entry_safe ( p, t, &v->arch.vgic.lr_pending, lr_queue ) { - lr = find_next_zero_bit(&this_cpu(lr_mask), nr_lrs, lr); + lr = gic_find_unused_lr(v, p, lr); if ( lr >= nr_lrs ) { /* No more free LRs: find a lower priority irq to evict */ diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h index 02732db..3fc4ceb 100644 --- a/xen/include/asm-arm/vgic.h +++ b/xen/include/asm-arm/vgic.h @@ -60,12 +60,18 @@ struct pending_irq * vcpu while it is still inflight and on an GICH_LR register on the * old vcpu. * + * GIC_IRQ_GUEST_PRISTINE_LPI: the IRQ is a newly mapped LPI, which + * has never been in an LR before. This means that any trace of an + * LPI with the same number in an LR must be from an older LPI, which + * has been unmapped before. + * */ #define GIC_IRQ_GUEST_QUEUED 0 #define GIC_IRQ_GUEST_ACTIVE 1 #define GIC_IRQ_GUEST_VISIBLE 2 #define GIC_IRQ_GUEST_ENABLED 3 #define GIC_IRQ_GUEST_MIGRATING 4 +#define GIC_IRQ_GUEST_PRISTINE_LPI 5 unsigned long status; struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */ unsigned int irq;
When LPIs get unmapped by a guest, they might still be in some LR of some VCPU. Nevertheless we remove the corresponding pending_irq (possibly freeing it), and detect this case (irq_to_pending() returns NULL) when the LR gets cleaned up later. However a *new* LPI may get mapped with the same number while the old LPI is *still* in some LR. To avoid getting the wrong state, we mark every newly mapped LPI as PRISTINE, which means: has never been in an LR before. If we detect the LPI in an LR anyway, it must have been an older one, which we can simply retire. Before inserting such a PRISTINE LPI into an LR, we must make sure that it's not already in another LR, as the architecture forbids two interrupts with the same virtual IRQ number on one CPU. Signed-off-by: Andre Przywara <andre.przywara@arm.com> --- xen/arch/arm/gic.c | 55 +++++++++++++++++++++++++++++++++++++++++----- xen/include/asm-arm/vgic.h | 6 +++++ 2 files changed, 56 insertions(+), 5 deletions(-)