diff mbox

[v6,10/36] ARM: GIC: Add checks for NULL pointer pending_irq's

Message ID 20170407173307.9788-11-andre.przywara@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Andre Przywara April 7, 2017, 5:32 p.m. UTC
For LPIs the struct pending_irq's are somewhat dynamically allocated and
the pointers are stored in a radix tree. While I convinced myself that
an invalid LPI number can't make it into the core code, people might be
concerned about NULL pointer dereferences.
So add checks in some places just to be on the safe side.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic.c  | 23 +++++++++++++++++++++++
 xen/arch/arm/vgic.c |  4 ++++
 2 files changed, 27 insertions(+)

Comments

Julien Grall April 7, 2017, 6:32 p.m. UTC | #1
Hi Andre,

On 07/04/17 18:32, Andre Przywara wrote:
> For LPIs the struct pending_irq's are somewhat dynamically allocated and

When I read "dynamically", I directly ask myself. What is protecting the 
pending_irq structure to be freed whilst in-use in the vgic code?

In the current design, this would happen if a device is removed from a 
domain (e.g via the MAPD command for DOM0).

I cannot find any code which would prevent that and I think DOM0 can 
take down Xen by mistake. Imagine a pending IRQ whilst executing MAPD(V=0).

So what's the plan?

> the pointers are stored in a radix tree. While I convinced myself that
> an invalid LPI number can't make it into the core code, people might be
> concerned about NULL pointer dereferences.
> So add checks in some places just to be on the safe side.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic.c  | 23 +++++++++++++++++++++++
>  xen/arch/arm/vgic.c |  4 ++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index da19130..44c34b1 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -405,6 +405,13 @@ void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq)
>      struct pending_irq *p = irq_to_pending(v, virtual_irq);
>      unsigned long flags;
>
> +    /*
> +     * If an LPIs has been removed meanwhile, it has been cleaned up
> +     * already, so nothing to remove here.
> +     */
> +    if ( !p )
> +        return;
> +
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>      if ( !list_empty(&p->lr_queue) )
>          list_del_init(&p->lr_queue);
> @@ -415,6 +422,10 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
>  {
>      struct pending_irq *n = irq_to_pending(v, virtual_irq);
>
> +    /* If an LPI has been removed meanwhile, there is nothing left to raise. */
> +    if ( !n )
> +        return;
> +
>      ASSERT(spin_is_locked(&v->arch.vgic.lock));
>
>      if ( list_empty(&n->lr_queue) )
> @@ -461,7 +472,19 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>
>      gic_hw_ops->read_lr(i, &lr_val);
>      irq = lr_val.virq;
> +

Please don't make spurious change...

>      p = irq_to_pending(v, irq);
> +    /* An LPI might have been unmapped, in which case we just clean up here. */
> +    if ( !p )
> +    {
> +        ASSERT(is_lpi(irq));
> +
> +        gic_hw_ops->clear_lr(i);
> +        clear_bit(i, &this_cpu(lr_mask));
> +
> +        return;
> +    }
> +
>      if ( lr_val.state & GICH_LR_ACTIVE )
>      {
>          set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 83569b0..b7ee105 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -470,6 +470,10 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>      unsigned long flags;
>      bool running;
>
> +    /* If an LPI has been removed, there is nothing to inject here. */
> +    if ( !n )
> +        return;
> +
>      priority = vgic_get_virq_priority(v, virq);
>
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>

Cheers,
Stefano Stabellini April 7, 2017, 7:07 p.m. UTC | #2
On Fri, 7 Apr 2017, Andre Przywara wrote:
> For LPIs the struct pending_irq's are somewhat dynamically allocated and
> the pointers are stored in a radix tree. While I convinced myself that
> an invalid LPI number can't make it into the core code, people might be
> concerned about NULL pointer dereferences.
> So add checks in some places just to be on the safe side.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

This approach looks fragile: what if we miss one irq_to_pending call? We
need a way to avoid pending_irq structs being freed when an irq is still
inflight. Only after an irq is not inflight anymore, a struct
pending_irq could be freed.


> ---
>  xen/arch/arm/gic.c  | 23 +++++++++++++++++++++++
>  xen/arch/arm/vgic.c |  4 ++++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index da19130..44c34b1 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -405,6 +405,13 @@ void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq)
>      struct pending_irq *p = irq_to_pending(v, virtual_irq);
>      unsigned long flags;
>  
> +    /*
> +     * If an LPIs has been removed meanwhile, it has been cleaned up
> +     * already, so nothing to remove here.
> +     */
> +    if ( !p )
> +        return;
> +
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>      if ( !list_empty(&p->lr_queue) )
>          list_del_init(&p->lr_queue);
> @@ -415,6 +422,10 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
>  {
>      struct pending_irq *n = irq_to_pending(v, virtual_irq);
>  
> +    /* If an LPI has been removed meanwhile, there is nothing left to raise. */
> +    if ( !n )
> +        return;
> +
>      ASSERT(spin_is_locked(&v->arch.vgic.lock));
>  
>      if ( list_empty(&n->lr_queue) )
> @@ -461,7 +472,19 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>  
>      gic_hw_ops->read_lr(i, &lr_val);
>      irq = lr_val.virq;
> +
>      p = irq_to_pending(v, irq);
> +    /* An LPI might have been unmapped, in which case we just clean up here. */
> +    if ( !p )
> +    {
> +        ASSERT(is_lpi(irq));
> +
> +        gic_hw_ops->clear_lr(i);
> +        clear_bit(i, &this_cpu(lr_mask));
> +
> +        return;
> +    }
> +
>      if ( lr_val.state & GICH_LR_ACTIVE )
>      {
>          set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 83569b0..b7ee105 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -470,6 +470,10 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>      unsigned long flags;
>      bool running;
>  
> +    /* If an LPI has been removed, there is nothing to inject here. */
> +    if ( !n )
> +        return;
> +
>      priority = vgic_get_virq_priority(v, virq);
>  
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> -- 
> 2.9.0
>
Andre Przywara April 7, 2017, 8:46 p.m. UTC | #3
On 07/04/17 20:07, Stefano Stabellini wrote:
> On Fri, 7 Apr 2017, Andre Przywara wrote:
>> For LPIs the struct pending_irq's are somewhat dynamically allocated and
>> the pointers are stored in a radix tree. While I convinced myself that
>> an invalid LPI number can't make it into the core code, people might be
>> concerned about NULL pointer dereferences.
>> So add checks in some places just to be on the safe side.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> 
> This approach looks fragile: what if we miss one irq_to_pending call? We
> need a way to avoid pending_irq structs being freed when an irq is still
> inflight. Only after an irq is not inflight anymore, a struct
> pending_irq could be freed.

Indeed. I was wondering if a dummy pend_irq could help on the first
step: Upon unmapping, we replace the radix-tree member(s) with this one
reserved instance (per domain), that would avoid the NULL pointer
dereference.
Then we keep the existing pend_irq (array) around until we are sure that
no-one is holding a reference anymore - either by using RCU (although I
think this is problematic because of the rcu_read_lock) or by finding a
definite point in time when no-one can possibly use that pointer
anymore. The pointer usage seems to be very confined, so I am hopeful we
can find such a limit (say: once every VCPU has exited and entered once
or the like).

Does that sound like a possible route?

Cheers,
Andre.


> 
> 
>> ---
>>  xen/arch/arm/gic.c  | 23 +++++++++++++++++++++++
>>  xen/arch/arm/vgic.c |  4 ++++
>>  2 files changed, 27 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index da19130..44c34b1 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -405,6 +405,13 @@ void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq)
>>      struct pending_irq *p = irq_to_pending(v, virtual_irq);
>>      unsigned long flags;
>>  
>> +    /*
>> +     * If an LPIs has been removed meanwhile, it has been cleaned up
>> +     * already, so nothing to remove here.
>> +     */
>> +    if ( !p )
>> +        return;
>> +
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>      if ( !list_empty(&p->lr_queue) )
>>          list_del_init(&p->lr_queue);
>> @@ -415,6 +422,10 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
>>  {
>>      struct pending_irq *n = irq_to_pending(v, virtual_irq);
>>  
>> +    /* If an LPI has been removed meanwhile, there is nothing left to raise. */
>> +    if ( !n )
>> +        return;
>> +
>>      ASSERT(spin_is_locked(&v->arch.vgic.lock));
>>  
>>      if ( list_empty(&n->lr_queue) )
>> @@ -461,7 +472,19 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>>  
>>      gic_hw_ops->read_lr(i, &lr_val);
>>      irq = lr_val.virq;
>> +
>>      p = irq_to_pending(v, irq);
>> +    /* An LPI might have been unmapped, in which case we just clean up here. */
>> +    if ( !p )
>> +    {
>> +        ASSERT(is_lpi(irq));
>> +
>> +        gic_hw_ops->clear_lr(i);
>> +        clear_bit(i, &this_cpu(lr_mask));
>> +
>> +        return;
>> +    }
>> +
>>      if ( lr_val.state & GICH_LR_ACTIVE )
>>      {
>>          set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 83569b0..b7ee105 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -470,6 +470,10 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>      unsigned long flags;
>>      bool running;
>>  
>> +    /* If an LPI has been removed, there is nothing to inject here. */
>> +    if ( !n )
>> +        return;
>> +
>>      priority = vgic_get_virq_priority(v, virq);
>>  
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>> -- 
>> 2.9.0
>>
Julien Grall April 7, 2017, 8:58 p.m. UTC | #4
Hi Andre,

On 04/07/2017 09:46 PM, André Przywara wrote:
> On 07/04/17 20:07, Stefano Stabellini wrote:
>> On Fri, 7 Apr 2017, Andre Przywara wrote:
>>> For LPIs the struct pending_irq's are somewhat dynamically allocated and
>>> the pointers are stored in a radix tree. While I convinced myself that
>>> an invalid LPI number can't make it into the core code, people might be
>>> concerned about NULL pointer dereferences.
>>> So add checks in some places just to be on the safe side.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>
>> This approach looks fragile: what if we miss one irq_to_pending call? We
>> need a way to avoid pending_irq structs being freed when an irq is still
>> inflight. Only after an irq is not inflight anymore, a struct
>> pending_irq could be freed.
>
> Indeed. I was wondering if a dummy pend_irq could help on the first
> step: Upon unmapping, we replace the radix-tree member(s) with this one
> reserved instance (per domain), that would avoid the NULL pointer
> dereference.

The problem with that is the code will continue to handle it, but as it 
is a dummy pending_irq (I understand there will be only one existing) 
you will corrupt the list.

> Then we keep the existing pend_irq (array) around until we are sure that
> no-one is holding a reference anymore - either by using RCU (although I
> think this is problematic because of the rcu_read_lock) or by finding a
> definite point in time when no-one can possibly use that pointer
> anymore. The pointer usage seems to be very confined, so I am hopeful we
> can find such a limit (say: once every VCPU has exited and entered once
> or the like).

I think the RCU case is fragile because the pending_irq could be added 
in the list which will be carried even after re-entering to the guest.

>
> Does that sound like a possible route?

How about using a reference (either on the device or the pending_irq)? A 
reference would be taken until the vLPI is correctly handled (i.e clear 
from the LRs).

This would prevent complex locking.

Cheers,
Stefano Stabellini April 7, 2017, 9:45 p.m. UTC | #5
On Fri, 7 Apr 2017, Julien Grall wrote:
> Hi Andre,
> 
> On 04/07/2017 09:46 PM, André Przywara wrote:
> > On 07/04/17 20:07, Stefano Stabellini wrote:
> > > On Fri, 7 Apr 2017, Andre Przywara wrote:
> > > > For LPIs the struct pending_irq's are somewhat dynamically allocated and
> > > > the pointers are stored in a radix tree. While I convinced myself that
> > > > an invalid LPI number can't make it into the core code, people might be
> > > > concerned about NULL pointer dereferences.
> > > > So add checks in some places just to be on the safe side.
> > > > 
> > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > 
> > > This approach looks fragile: what if we miss one irq_to_pending call? We
> > > need a way to avoid pending_irq structs being freed when an irq is still
> > > inflight. Only after an irq is not inflight anymore, a struct
> > > pending_irq could be freed.
> > 
> > Indeed. I was wondering if a dummy pend_irq could help on the first
> > step: Upon unmapping, we replace the radix-tree member(s) with this one
> > reserved instance (per domain), that would avoid the NULL pointer
> > dereference.
> 
> The problem with that is the code will continue to handle it, but as it is a
> dummy pending_irq (I understand there will be only one existing) you will
> corrupt the list.

> > Does that sound like a possible route?
> 
> How about using a reference (either on the device or the pending_irq)? A
> reference would be taken until the vLPI is correctly handled (i.e clear from
> the LRs).
> 
> This would prevent complex locking.

We need to mark the pending_irq "to be freed", like we do with migration
(GIC_IRQ_GUEST_MIGRATING marks an irq to be migrated). Afterwards, when
the irq is ready, it will removed from the tree and freed.
Stefano Stabellini April 7, 2017, 10:09 p.m. UTC | #6
On Fri, 7 Apr 2017, Stefano Stabellini wrote:
> On Fri, 7 Apr 2017, Julien Grall wrote:
> > Hi Andre,
> > 
> > On 04/07/2017 09:46 PM, André Przywara wrote:
> > > On 07/04/17 20:07, Stefano Stabellini wrote:
> > > > On Fri, 7 Apr 2017, Andre Przywara wrote:
> > > > > For LPIs the struct pending_irq's are somewhat dynamically allocated and
> > > > > the pointers are stored in a radix tree. While I convinced myself that
> > > > > an invalid LPI number can't make it into the core code, people might be
> > > > > concerned about NULL pointer dereferences.
> > > > > So add checks in some places just to be on the safe side.
> > > > > 
> > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > 
> > > > This approach looks fragile: what if we miss one irq_to_pending call? We
> > > > need a way to avoid pending_irq structs being freed when an irq is still
> > > > inflight. Only after an irq is not inflight anymore, a struct
> > > > pending_irq could be freed.
> > > 
> > > Indeed. I was wondering if a dummy pend_irq could help on the first
> > > step: Upon unmapping, we replace the radix-tree member(s) with this one
> > > reserved instance (per domain), that would avoid the NULL pointer
> > > dereference.
> > 
> > The problem with that is the code will continue to handle it, but as it is a
> > dummy pending_irq (I understand there will be only one existing) you will
> > corrupt the list.
> 
> > > Does that sound like a possible route?
> > 
> > How about using a reference (either on the device or the pending_irq)? A
> > reference would be taken until the vLPI is correctly handled (i.e clear from
> > the LRs).
> > 
> > This would prevent complex locking.
> 
> We need to mark the pending_irq "to be freed", like we do with migration
> (GIC_IRQ_GUEST_MIGRATING marks an irq to be migrated). Afterwards, when
> the irq is ready, it will removed from the tree and freed.

I'll clarify. Andre, give a look at how vgic_migrate_irq is implemented:
first it tries to migrate the irq immediately, but if it is not
possible, it will mark the irq "to be migrated". Then, in
gic_update_one_lr, we check for the "to be migrated" flag and do the
actual migration.

We need to introduce a similar workflow here. If we can remove
pending_irq from the tree, we do it immediately. Otherwise we mark it
"to be removed" and do it later as soon as we can. We also add check for
the "to be removed" flag in other places, where appropriate, to avoid
trying to inject another LPI that is already supposed to be removed. I
hope that's clearer.
Andre Przywara April 7, 2017, 10:12 p.m. UTC | #7
On 07/04/17 23:09, Stefano Stabellini wrote:
> On Fri, 7 Apr 2017, Stefano Stabellini wrote:
>> On Fri, 7 Apr 2017, Julien Grall wrote:
>>> Hi Andre,
>>>
>>> On 04/07/2017 09:46 PM, André Przywara wrote:
>>>> On 07/04/17 20:07, Stefano Stabellini wrote:
>>>>> On Fri, 7 Apr 2017, Andre Przywara wrote:
>>>>>> For LPIs the struct pending_irq's are somewhat dynamically allocated and
>>>>>> the pointers are stored in a radix tree. While I convinced myself that
>>>>>> an invalid LPI number can't make it into the core code, people might be
>>>>>> concerned about NULL pointer dereferences.
>>>>>> So add checks in some places just to be on the safe side.
>>>>>>
>>>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>>>
>>>>> This approach looks fragile: what if we miss one irq_to_pending call? We
>>>>> need a way to avoid pending_irq structs being freed when an irq is still
>>>>> inflight. Only after an irq is not inflight anymore, a struct
>>>>> pending_irq could be freed.
>>>>
>>>> Indeed. I was wondering if a dummy pend_irq could help on the first
>>>> step: Upon unmapping, we replace the radix-tree member(s) with this one
>>>> reserved instance (per domain), that would avoid the NULL pointer
>>>> dereference.
>>>
>>> The problem with that is the code will continue to handle it, but as it is a
>>> dummy pending_irq (I understand there will be only one existing) you will
>>> corrupt the list.
>>
>>>> Does that sound like a possible route?
>>>
>>> How about using a reference (either on the device or the pending_irq)? A
>>> reference would be taken until the vLPI is correctly handled (i.e clear from
>>> the LRs).
>>>
>>> This would prevent complex locking.
>>
>> We need to mark the pending_irq "to be freed", like we do with migration
>> (GIC_IRQ_GUEST_MIGRATING marks an irq to be migrated). Afterwards, when
>> the irq is ready, it will removed from the tree and freed.
> 
> I'll clarify. Andre, give a look at how vgic_migrate_irq is implemented:
> first it tries to migrate the irq immediately, but if it is not
> possible, it will mark the irq "to be migrated". Then, in
> gic_update_one_lr, we check for the "to be migrated" flag and do the
> actual migration.
> 
> We need to introduce a similar workflow here. If we can remove
> pending_irq from the tree, we do it immediately. Otherwise we mark it
> "to be removed" and do it later as soon as we can. We also add check for
> the "to be removed" flag in other places, where appropriate, to avoid
> trying to inject another LPI that is already supposed to be removed. I
> hope that's clearer.

Indeed, that sounds like a good idea. Thanks for that!
Now we just need to find a neat way of dealing with the array nature of
our pend_irq allocation, so we can only free it when *all* LPIs from
that device have been removed. But I am confident to find something.

Cheers,
Andre.
diff mbox

Patch

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index da19130..44c34b1 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -405,6 +405,13 @@  void gic_remove_from_queues(struct vcpu *v, unsigned int virtual_irq)
     struct pending_irq *p = irq_to_pending(v, virtual_irq);
     unsigned long flags;
 
+    /*
+     * If an LPIs has been removed meanwhile, it has been cleaned up
+     * already, so nothing to remove here.
+     */
+    if ( !p )
+        return;
+
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
     if ( !list_empty(&p->lr_queue) )
         list_del_init(&p->lr_queue);
@@ -415,6 +422,10 @@  void gic_raise_inflight_irq(struct vcpu *v, unsigned int virtual_irq)
 {
     struct pending_irq *n = irq_to_pending(v, virtual_irq);
 
+    /* If an LPI has been removed meanwhile, there is nothing left to raise. */
+    if ( !n )
+        return;
+
     ASSERT(spin_is_locked(&v->arch.vgic.lock));
 
     if ( list_empty(&n->lr_queue) )
@@ -461,7 +472,19 @@  static void gic_update_one_lr(struct vcpu *v, int i)
 
     gic_hw_ops->read_lr(i, &lr_val);
     irq = lr_val.virq;
+
     p = irq_to_pending(v, irq);
+    /* An LPI might have been unmapped, in which case we just clean up here. */
+    if ( !p )
+    {
+        ASSERT(is_lpi(irq));
+
+        gic_hw_ops->clear_lr(i);
+        clear_bit(i, &this_cpu(lr_mask));
+
+        return;
+    }
+
     if ( lr_val.state & GICH_LR_ACTIVE )
     {
         set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 83569b0..b7ee105 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -470,6 +470,10 @@  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
     unsigned long flags;
     bool running;
 
+    /* If an LPI has been removed, there is nothing to inject here. */
+    if ( !n )
+        return;
+
     priority = vgic_get_virq_priority(v, virq);
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);